An important issue in survival analysis is the investigation and
the modeling of hazard rates. Within a Bayesian nonparametric frame- 
work, a natural and popular approach is to model hazard rates as 
kernel mixtures with respect to a completely random measure. In 
this paper we provide a comprehensive analysis of the asymptotic 
behavior of such models. We investigate consistency of the posterior 
distribution and derive fixed sample size central limit theorems for 
both linear and quadratic functionals of the posterior hazard rate. 
The general results are then specialized to various specific kernels 
and mixing measures yielding consistency under minimal conditions 
and neat central limit theorems for the distribution of functionals. 

1. Introduction. Bayesian nonparametric methods have found a fertile 
ground of applications within survival analysis. Indeed, given that survival 
analysis typically requires function estimation, the Bayesian nonparametric 
paradigm seems to be tailor made for such problems, as already shown in 
the seminal papers by Doksum [4], Dykstra and Laud [6], Lo and Weng 
[24] and Hjort [11]. According to the approach of [6, 24], the hazard rate 
is modeled as a mixture of a suitable kernel with respect to an increasing 
additive process (see [32]) or, more generally, a completely random measure 
(see [21]). This approach will be the focus of the present paper: below we 
first present the model and, then, the two asymptotic issues we are going 
to tackle, namely weak consistency and the derivation of fixed sample size 
central limit theorems (CLTs) for functionals of the posterior hazard rate. 

1.1. Life-testing model with mixture hazard rate. Denote by Y a positive
absolutely continuous random variable representing the lifetime and assume 
that its random hazard rate is of the form 

(1) $h(t) = \int_{\mathbb{X}} k(t,x)\,\tilde\mu(dx),$

where $k$ is a kernel and $\tilde\mu$ a completely random measure on some Polish space
$\mathbb{X}$ endowed with its Borel $\sigma$-field $\mathscr{X}$. The kernel $k$ is a jointly measurable
application from $\mathbb{R}^+ \times \mathbb{X}$ to $\mathbb{R}^+$ and the application $C \mapsto \int_C k(t,x)\,dt$ defines
a $\sigma$-finite measure on $\mathscr{B}(\mathbb{R}^+)$ for any $x$ in $\mathbb{X}$. Typical choices, which we will
also consider in this paper, are:

(i) the Dykstra-Laud (DL) kernel [6]

(2) $k(t,x) = \mathbb{I}_{(0 \le x \le t)},$

which leads to monotone increasing hazard rates;

(ii) the rectangular kernel (see, e.g., [13]) with bandwidth $\tau > 0$

(3) $k(t,x) = \mathbb{I}_{(|t-x| \le \tau)};$

(iii) the Ornstein-Uhlenbeck (OU) kernel (see, e.g., [25, 26]) with $\kappa > 0$

(4) $k(t,x) = \sqrt{2\kappa}\,\exp(-\kappa(t-x))\,\mathbb{I}_{(0 \le x \le t)};$

(iv) the exponential kernel (see, e.g., [14])

(5) $k(t,x) = x^{-1} e^{-t/x},$

which yields monotone decreasing hazard rates.
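
To make the four kernels concrete, the following minimal Python sketch evaluates the mixture hazard (1) when the mixing measure is replaced by a finite sum of point masses. This is an assumption made purely for illustration: a CRM satisfying the conditions of this paper has infinitely many jumps, and the atom locations and jump sizes below are hypothetical values, not draws from any specific CRM.

```python
import math

def dl_kernel(t, x):
    """Dykstra-Laud kernel (2): indicator of 0 <= x <= t."""
    return 1.0 if 0.0 <= x <= t else 0.0

def rectangular_kernel(t, x, tau):
    """Rectangular kernel (3) with bandwidth tau > 0."""
    return 1.0 if abs(t - x) <= tau else 0.0

def ou_kernel(t, x, kappa):
    """Ornstein-Uhlenbeck kernel (4) with kappa > 0."""
    if 0.0 <= x <= t:
        return math.sqrt(2.0 * kappa) * math.exp(-kappa * (t - x))
    return 0.0

def exponential_kernel(t, x):
    """Exponential kernel (5); requires x > 0."""
    return math.exp(-t / x) / x

def mixture_hazard(t, kernel, atoms):
    """Hazard (1) when the mixing measure is a finite sum of J * delta_x."""
    return sum(jump * kernel(t, x) for x, jump in atoms)

# Hypothetical atoms (location x, jump size J); illustration only.
atoms = [(0.5, 0.3), (1.2, 0.7), (2.0, 0.1)]
grid = [0.25 * i for i in range(1, 17)]
dl_path = [mixture_hazard(t, dl_kernel, atoms) for t in grid]
exp_path = [mixture_hazard(t, exponential_kernel, atoms) for t in grid]
```

On this grid the DL path is a nondecreasing step function and the exponential path is strictly decreasing, in line with the monotonicity properties stated for (2) and (5).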

As for the mixing measure in (1), letting $(\mathbb{M}, \mathscr{B}(\mathbb{M}))$ be the space of
boundedly finite measures on $(\mathbb{X}, \mathscr{X})$, $\tilde\mu$ is taken to be a completely random
measure (CRM) in the sense of [21]. This means that $\tilde\mu$ is a random element
defined on $(\Omega, \mathscr{F}, P)$, taking values in $(\mathbb{M}, \mathscr{B}(\mathbb{M}))$ and such that, for any
collection of disjoint sets $B_1, B_2, \ldots$, the random variables $\tilde\mu(B_1), \tilde\mu(B_2), \ldots$
are mutually independent. Appendix A.1 provides a brief account of CRMs,
as well as justifications of the following statements. It is important to recall
that a CRM is characterized by its Poisson intensity $\nu$, which we can write
as

(6) $\nu(dv,dx) = \rho(dv|x)\,\lambda(dx),$

where $\lambda$ is a $\sigma$-finite measure on $\mathbb{X}$. If, furthermore, $\nu(dv,dx) = \rho(dv)\,\lambda(dx)$,
the corresponding CRM $\tilde\mu$ is termed homogeneous, otherwise it is said to
be nonhomogeneous. We always consider kernels such that $\int_{\mathbb{X}} k(t,x)\,\lambda(dx) < +\infty$.
Throughout the paper, we will take $\nu$ and $\lambda$ to be nonatomic and we
shall moreover assume that

(H1) $\rho(\mathbb{R}^+|x) = +\infty$ a.e.-$\lambda$ and supp$(\lambda) = \mathbb{X}$,

where supp$(r)$ indicates the topological support of a given measure $r$. Note
that (H1) is equivalent to requiring that $\tilde\mu$ jumps infinitely often on any
bounded set of positive $\lambda$-measure and is indeed a desirable property for a
mixing measure, since it ensures that the topological support of $\tilde\mu$ is the
whole space $\mathbb{M}$. See also the discussion around formula (3.22) in [18] for an
account of the usefulness of (H1) for inferential purposes. In the examples
we will focus on a large class of CRMs, which includes almost all CRMs
used so far in applications and is characterized by an intensity measure of
the type

(7) $\nu(dv,dx) = \frac{1}{\Gamma(1-\sigma)}\,\frac{e^{-\gamma(x)v}}{v^{1+\sigma}}\,dv\,\lambda(dx),$

where $\sigma \in [0,1)$ and $\gamma$ is a strictly positive function on $\mathbb{X}$. Note that, if
$\gamma$ is a constant, the resulting CRMs coincide with the generalized gamma
measures [2], whereas when $\sigma = 0$ they are extended gamma CRMs [6, 24].
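
As a numerical sanity check on the class (7), the sketch below fixes a location, treats gamma as a constant, and compares the integral of $(1 - e^{-uv})$ against the jump-size density with its standard closed form, $((u+\gamma)^\sigma - \gamma^\sigma)/\sigma$ for $\sigma \in (0,1)$ and $\log(1 + u/\gamma)$ for $\sigma = 0$. The function names and the integration settings are illustrative choices, not part of the paper.

```python
import math

def levy_density(v, sigma, gamma):
    """Jump-size density in (7) at a fixed location:
    e^{-gamma v} / (Gamma(1 - sigma) * v^{1 + sigma})."""
    return math.exp(-gamma * v) / (math.gamma(1.0 - sigma) * v ** (1.0 + sigma))

def laplace_exponent_closed_form(u, sigma, gamma):
    """Closed form of the integral of (1 - e^{-u v}) against levy_density."""
    if sigma == 0.0:
        return math.log1p(u / gamma)
    return ((u + gamma) ** sigma - gamma ** sigma) / sigma

def laplace_exponent_numeric(u, sigma, gamma, lo=-60.0, hi=6.0, step=0.01):
    """Trapezoidal rule after the substitution v = e^y, which removes
    the singularity of the density at v = 0."""
    n = int(round((hi - lo) / step))
    total = 0.0
    for i in range(n + 1):
        v = math.exp(lo + i * step)
        value = -math.expm1(-u * v) * levy_density(v, sigma, gamma) * v
        total += value * (0.5 if i in (0, n) else 1.0)
    return total * step
```

Although the density itself is not integrable near zero, which is what makes the CRM jump infinitely often as required by the first part of (H1), the integrand above is integrable, so both functions return finite values.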

Having defined the ingredients of the mixture hazard (1), we can complete
the description of the model, which is often referred to as the life-testing model.
The cumulative hazard is then given by $H(t) = \int_0^t h(s)\,ds$ and, provided

(8) $H(t) \to \infty$ for $t \to +\infty$ a.s.,

one can define a random density function $f$ as

(9) $f(t) = h(t)\exp(-H(t)) = h(t)\,S(t),$

where $S(t) := \exp(-H(t))$ is the survival function, providing the probability
that $Y > t$. Consequently, the random cumulative distribution function of $Y$
is of the form $F(t) = 1 - \exp(-H(t))$. Note that, given $\tilde\mu$, $h$ represents the
hazard rate of $Y$, that is, $h(t)\,dt = P(t \le Y \le t + dt \mid Y \ge t, \tilde\mu)$. Throughout
the paper we will assume that

(H2) $E[H(t)] = \int_0^t \int_{\mathbb{R}^+ \times \mathbb{X}} v\,k(u,x)\,\rho(dv|x)\,\lambda(dx)\,du < +\infty \quad \forall t > 0.$
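
The chain from hazard to cumulative hazard, survival function and density in (8)-(9) can be sketched for the DL kernel, where the cumulative hazard is available in closed form. As a simplifying assumption, the mixing measure is again a finite sum of hypothetical point masses; with finitely many atoms the hazard is eventually constant, so (8) still holds.

```python
import math

def dl_hazard(t, atoms):
    """h(t) for the DL kernel: total mass of the atoms located in [0, t]."""
    return sum(jump for x, jump in atoms if 0.0 <= x <= t)

def dl_cumulative_hazard(t, atoms):
    """H(t), the integral of h over [0, t], in closed form for the DL kernel."""
    return sum(jump * (t - x) for x, jump in atoms if 0.0 <= x <= t)

def survival(t, atoms):
    """S(t) = exp(-H(t))."""
    return math.exp(-dl_cumulative_hazard(t, atoms))

def density(t, atoms):
    """f(t) = h(t) S(t), as in (9)."""
    return dl_hazard(t, atoms) * survival(t, atoms)

def integrate(func, a, b, n=20000):
    """Composite midpoint rule on [a, b]."""
    width = (b - a) / n
    return width * sum(func(a + (i + 0.5) * width) for i in range(n))

# Hypothetical (location, jump) pairs; illustration only.
atoms = [(0.5, 0.3), (1.2, 0.7), (2.0, 0.1)]
mass_up_to_5 = integrate(lambda t: density(t, atoms), 0.0, 5.0)
```

The integral of the density over $[0, 5]$ agrees with $F(5) = 1 - S(5)$ up to quadrature error, which is the relation between $f$ and $F$ stated above.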

Such models have recently received much attention due to their rela- 
tively simple implementation in applications. Important developments, deal- 
ing also with more general multiplicative intensity models, can be found in 
[12, 13, 14, 15, 25, 26], among others. 

1.2. Posterior consistency. The study of consistency of Bayesian non- 
parametric procedures represents one of the main recent research topics in 
Bayesian theory. The "frequentist" (or "what if") approach to Bayesian
consistency consists of generating independent data from a "true" fixed density
$f_0$ and checking whether the sequence of posterior distributions accumulates
in some suitable neighborhood of $f_0$. Specifically, denote by $P_0$ the probability
distribution associated with $f_0$ and by $P_0^\infty$ the infinite product measure.
Moreover, the symbol $\mathbb{F}$ indicates the space of density functions absolutely
continuous with respect to the Lebesgue measure on $\mathbb{R}$, endowed with the
Borel $\sigma$-field $\mathscr{B}(\mathbb{F})$ (with respect to an appropriate $L^1$-topology). Now, if $\Pi$
is the prior distribution of some random density function $f$, taking values
in $\mathbb{F}$, and $\Pi_n$ denotes its posterior distribution, then one is interested in
establishing sufficient conditions to have that, as $n \to +\infty$, for any $\varepsilon > 0$,

(10) $\Pi_n(A_\varepsilon(f_0)) \to 1$ a.s.-$P_0^\infty$,

where $A_\varepsilon(f_0)$ represents an $\varepsilon$-neighborhood of $f_0$ in a suitable topology. If
(10) holds, then $\Pi$ is said to be consistent at $f_0$. Now, if $A_\varepsilon(f_0)$ is chosen to
be a weak neighborhood, one obtains weak consistency. Sufficient conditions
for weak consistency of various important nonparametric models have been
provided in, for example, [8, 33, 35, 37]. By requiring (10) to hold with $A_\varepsilon$
being an $L^1$-neighborhood, one obtains the stronger notion of $L^1$ consistency:
general sufficient conditions for this to happen are provided in [1, 8, 36].
In the context of discrete models such as neutral to the right processes, 
posterior consistency has been studied in [9, 19, 20]. For a thorough review of 
the literature on consistency issues, the reader is referred to the monograph 
[10]. 

Turning back to the life-testing model defined by (1) and (9), little is 
known about consistency, since their structure is intrinsically very differ- 
ent from the models considered so far. First results were given in [5, 25]. 
In particular, in [5] consistency is established for the DL kernel with ex- 
tended gamma mixing measure assuming a bounded "true" hazard. In this 
paper, we determine sufficient conditions for weak consistency of Bayesian 
nonparametric models defined in terms of mixture random hazard rates. We 
also cover the case of lifetimes subject to independent right-censoring. Then, 
we use this general result for establishing weak consistency for mixture haz- 
ards with the specific kernels in (2)-(5) and CRMs characterized by (7). 
In particular, we obtain consistency essentially w.r.t. nondecreasing hazards 
for DL mixtures, w.r.t. bounded Lipschitz hazards for rectangular mixtures, 
w.r.t. hazards with a certain local exponential decay rate for OU mixtures
and w.r.t. completely monotone hazards for exponential mixtures. 

1.3. Functionals of the posterior mixture hazard rate. The second as- 
pect we investigate is the asymptotic behavior (in the sense of larger and 
larger time horizons) of functionals of the posterior random hazard rate 
given a fixed number of observations. We shall focus on functionals of sta- 
tistical relevance, such as means, path-second moments and path-variances. 
Indeed, any CLT involving this type of functionals may be used to derive a 
synthetic — yet highly informative — picture of the "global shape" of a given 
(prior or posterior) hazard rate model. In particular, as we will see below, 
CLTs for linear and quadratic functionals contain specific information about 
the trend, the oscillations and the overall asymptotic variance of a random
object such as (1). This represents an important issue since, though widely
used in practice, the implications of the choice of specific kernels and CRMs
in defining (1) are generally not well understood and their choice is based
on mere empirical considerations. In [27] functionals of the prior hazard rate
are considered: the results, besides being of theoretical relevance, can serve
also as a guide for prior specification. For instance, it is shown that the trend
of the cumulative hazard with a DL kernel (2) with a homogeneous CRM
is $T^2$, with the oscillations around the trend increasing like $T^{3/2}$, whereas
with a rectangular kernel the trend is $T$ and the oscillations increase like
$T^{1/2}$. Moreover, the parameters of the kernel and the CRM enter the variance
of the asymptotic Gaussian random variable, thus leading to a rigorous
procedure for their a priori selection.

Here, we face the more challenging problem of deriving CLTs for the pos- 
terior hazard rate: indeed, the model defined by (1) and (9) is not conjugate 
and, hence, the derivation of distributional results for posterior functionals 
is quite demanding. However, by exploiting the posterior representation of 
James [15] (to be detailed in Section 2), we are able to provide fixed sample 
size CLTs also for functionals of posterior hazard rates. One of our main 
findings is that, in all the considered special cases, the CLTs associated with 
the posterior hazard rate are the same as for the prior ones, and this for 
any number of observations. If one interprets CLTs as approximate "global 
pictures" of a model, the conclusions to be drawn from our results are quite 
clear. Indeed, although consistency implies that a given model can be asymp- 
totically directed toward any deterministic target, the overall structure of a 
posterior hazard rate is systematically determined by the prior choice, even 
after conditioning on a very large number of observations. 

As an example of the results derived in the sequel, consider again the
hazard rate given by the DL kernel (2) with a homogeneous CRM, and let
$\mathbf{Y} = (Y_1, \ldots, Y_n)$ be a set of observations. In Section 4.3.1, we will prove that

$T^{-3/2}\,[H(T) - cT^2] \mid \mathbf{Y} \;\xrightarrow{\text{law}}\; X$

(the precise meaning of such a conditional convergence in law will be clarified
in the sequel), where $c$ is a constant and $X$ is a centered Gaussian random
variable with variance $\sigma^2$. As anticipated, the crucial point will be that
both $c$ and $\sigma^2$ are independent of $n$ and $\mathbf{Y}$, and that they are actually the
same constants appearing in the prior CLTs proved in [27]. A more detailed
illustration of these phenomena is provided in Section 4.3, where we also 
discuss analogous results involving other models, as well as limit theorems 
for quadratic functionals. 

We stress that our choice of +oo as a limiting point is mainly conven- 
tional, and that one can easily modify our framework to deal with models 
that live within a finite window of time by using an appropriate deformation
of the time scale. For instance, one can embed a hazard rate model
defined on $[0, +\infty)$ into a finite time interval, by substituting the time
parameter $T$ in the previous discussion with an increasing function of the type
$\log[T^*/(T^* - T)]$, where $T^* < +\infty$ and $0 < T < T^*$.
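
The time deformation mentioned above is easy to write down together with its inverse. The sketch below is only an illustration of the map; the function names are hypothetical, and the endpoint 0 is included for convenience.

```python
import math

def deform(t, t_star):
    """Map t in [0, t_star) to log(t_star / (t_star - t)) in [0, +inf)."""
    if not 0.0 <= t < t_star:
        raise ValueError("t must lie in [0, t_star)")
    return math.log(t_star / (t_star - t))

def inverse_deform(s, t_star):
    """Inverse map: s in [0, +inf) goes back to t_star * (1 - exp(-s))."""
    return t_star * (1.0 - math.exp(-s))
```

The map is increasing and diverges as $T$ approaches $T^*$, so limits taken as the deformed time tends to infinity correspond to limits at the right end of the finite window.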

1.4. Outline. The paper is organized as follows. Section 2 provides the 
posterior characterization of model (1). In Section 3 sufficient conditions for 
weak consistency are established. Section 4 deals with posterior linear and 
quadratic functionals of the mixture hazard. The results are illustrated by 
various examples involving specific kernels and CRMs. In Section 5 some 
concluding remarks and future research lines are presented. Further results, 
which are also of independent interest, and the proofs are deferred to the 
Appendix. 

2. Posterior distribution of the random hazard rate. In order to make
Bayesian inference starting from model (1), an explicit posterior
characterization is essential. Indeed, the first treatments of model (1) were limited
to considering extended gamma CRMs, which allow for a relatively simple
posterior characterization [6, 24]. Analysis beyond gamma-like choices of $\tilde\mu$
was not possible for a long time due to the lack of a suitable and
implementable posterior characterization; this goal was achieved in James [15],
and many choices for $\tilde\mu$ can now be explored. See also [23] for
a different derivation of these results. In what follows, we give an explicit
description of the posterior characterization of the model (1).

Let $P_f$ be the random probability measure associated with (9) and denote
by $(Y_n)_{n \ge 1}$ a sequence of exchangeable observations, defined on $(\Omega, \mathscr{F}, P)$
and taking values in $\mathbb{R}^+$, such that, given $P_f$, the $Y_n$'s are i.i.d. with
distribution $P_f$, that is, $P[Y_1 \in B_1, \ldots, Y_n \in B_n \mid P_f] = \prod_{i=1}^n P_f(B_i)$ for any
$B_i \in \mathscr{B}(\mathbb{R}^+)$, $i = 1, \ldots, n$ and $n \ge 1$. By (1) and (9), the joint (conditional)
density of $\mathbf{Y} = (Y_1, \ldots, Y_n)$ given $\tilde\mu = \mu$ is then given by

$\prod_{i=1}^n \Big\{ \int_{\mathbb{X}} k(y_i,x)\,\mu(dx)\, \exp\Big(-\int_0^{y_i}\!\int_{\mathbb{X}} k(t,x)\,\mu(dx)\,dt\Big) \Big\}.$


In this context it is important to consider also some censoring mechanism,
specifically independent right-censoring. Hence, suppose there are additionally
$Y_{n+1}, \ldots, Y_m$ random times which are right censored by censoring times
$C_{n+1}, \ldots, C_m$, that is, $Y_i > C_i$ for $i = n+1, \ldots, m$ [by exchangeability, it
would be equivalent to assume the right censored data to be an arbitrary
$(m-n)$-dimensional subvector of $(Y_1, \ldots, Y_m)$]. It is well known that
assuming the distribution of $C$ to be known is equivalent to assuming the
distribution of $C$ is a priori independent of the distribution of $Y$. Hence,
the posterior distribution of $\tilde\mu$ may be obtained without even specifying
the prior on the distribution of $C$. Then the likelihood function based on
$\mathbf{Y} = (Y_1, \ldots, Y_m)$, where the vector $\mathbf{Y}$ is composed of $n$ completely observed
times and $m - n$ right censored times, has the form

(11) $\mathscr{L}(\mu; \mathbf{y}) = e^{-\int_{\mathbb{X}} K_m(x)\,\mu(dx)} \prod_{i=1}^n \int_{\mathbb{X}} k(y_i,x)\,\mu(dx),$

where $K_m(x) = \sum_{i=1}^m \int_0^{y_i \wedge c_i} k(t,x)\,dt$ and we set $c_i = \infty$ for $i = 1, \ldots, n$. If
we now augment the likelihood with respect to the latent variables $\mathbf{X} =
(X_1, \ldots, X_n)$, (11) reduces to

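$\mathscr{L}(\mu; \mathbf{y}, \mathbf{x}) = e^{-\int_{\mathbb{X}} K_m(x)\,\mu(dx)} \prod_{i=1}^n k(y_i,x_i)\,\mu(dx_i) = e^{-\int_{\mathbb{X}} K_m(x)\,\mu(dx)} \prod_{j=1}^k \mu(dx_j^*)^{n_j} \prod_{i \in D_j} k(y_i,x_j^*),$

where $\mathbf{X}^* = (X_1^*, \ldots, X_k^*)$ denote the $k \le n$ distinct latent variables, $n_j$ is
the frequency of $X_j^*$ and $D_j = \{r : x_r = x_j^*\}$. Finally, set $\tau_{n_j}(x) =
\int_{\mathbb{R}^+} v^{n_j} e^{-vK_m(x)}\,\rho(dv|x)$. We are now in a position to state the posterior
characterization of the mixture hazard rate.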
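
The bookkeeping behind the distinct latent values, their frequencies and the index sets can be sketched as follows, together with the exposure term $K_m(x)$ for the DL kernel, where the inner integral of the indicator over $[0, a]$ equals $\max(a - x, 0)$. The helper names are hypothetical, and ties among latent values are detected by exact equality, which is an assumption suited to this toy input only.

```python
from collections import defaultdict

def summarize_latents(x):
    """Return the distinct values x*_j, their frequencies n_j and the
    index sets D_j (1-based) of a latent vector x."""
    groups = defaultdict(list)
    for r, value in enumerate(x, start=1):
        groups[value].append(r)
    distinct = list(groups)  # order of first appearance
    index_sets = [groups[v] for v in distinct]
    frequencies = [len(d) for d in index_sets]
    return distinct, frequencies, index_sets

def dl_K(x, observed, censoring=()):
    """K_m(x) for the DL kernel and x >= 0: exact times contribute y_i,
    right-censored ones contribute their censoring time c_i."""
    times = list(observed) + list(censoring)
    return sum(max(t - x, 0.0) for t in times)
```

For example, `summarize_latents([0.4, 1.1, 0.4, 0.4, 1.1])` yields two distinct values with frequencies 3 and 2, and the frequencies always add up to the number of exact observations.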

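Theorem 1 (James [15]). Let $h$ be a random hazard rate as defined in
(1), corresponding to model (9). Then, given $\mathbf{Y}$, the posterior distribution
of $h$ can be characterized as follows:

(i) Given $\mathbf{X}$ and $\mathbf{Y}$, the conditional distribution of $\tilde\mu$ coincides with the
distribution of the random measure

(12) $\tilde\mu^{m,*} + \sum_{i=1}^k J_i\,\delta_{X_i^*} = \tilde\mu^{m,*} + \Delta^{m,*},$

where $\tilde\mu^{m,*}$ is a CRM with intensity measure

(13) $\nu^{m,*}(dv,dx) := e^{-vK_m(x)}\,\rho(dv|x)\,\lambda(dx),$

and the $X_i^*$ ($i = 1, \ldots, k$) are the fixed points of
discontinuity with corresponding jump $J_i$ distributed as

(14) $f_{J_i}(dv) = \frac{v^{n_i} e^{-vK_m(X_i^*)}\,\rho(dv|X_i^*)}{\int_{\mathbb{R}^+} v^{n_i} e^{-vK_m(X_i^*)}\,\rho(dv|X_i^*)}.$

Moreover, the $J_i$'s are, conditionally on $\mathbf{X}$ and $\mathbf{Y}$, independent of $\tilde\mu^{m,*}$.

(ii) Conditionally on $\mathbf{Y}$, the distribution of the latent variables $\mathbf{X}$ is

$P(dx_1^*, \ldots, dx_k^* \mid \mathbf{Y}) = \frac{\prod_{j=1}^k \tau_{n_j}(x_j^*) \prod_{i \in D_j} k(y_i,x_j^*)\,\lambda(dx_j^*)}{\sum_{k=1}^n \sum_{\mathbf{n} \in \Delta_{k,n}} \prod_{j=1}^k \int_{\mathbb{X}} \tau_{n_j}(x) \prod_{i \in D_j} k(y_i,x)\,\lambda(dx)}$

for any $k \in \{1, \ldots, n\}$ and $\mathbf{n} := (n_1, \ldots, n_k) \in \Delta_{k,n} := \{(n_1, \ldots, n_k) : n_j \ge 1, \sum_{j=1}^k n_j = n\}$.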
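
As a worked special case, suppose the prior intensity belongs to the class (7). Substituting it into (13) and (14) gives the following; this is a direct computation under that assumption rather than a statement quoted from [15], and "Gamma(a, b)" below denotes the gamma law with shape $a$ and rate $b$.

```latex
\begin{align*}
\nu^{m,*}(dv,dx) &= \frac{1}{\Gamma(1-\sigma)}\,
   \frac{e^{-(\gamma(x)+K_m(x))\,v}}{v^{1+\sigma}}\,dv\,\lambda(dx),\\[4pt]
P(J_i \in dv \mid \mathbf{X},\mathbf{Y}) &\propto
   v^{n_i}\,e^{-vK_m(X_i^*)}\,\frac{e^{-\gamma(X_i^*)\,v}}{v^{1+\sigma}}\,dv
   = v^{(n_i-\sigma)-1}\,e^{-(\gamma(X_i^*)+K_m(X_i^*))\,v}\,dv .
\end{align*}
```

Hence each jump $J_i$ is Gamma$(n_i - \sigma,\ \gamma(X_i^*) + K_m(X_i^*))$, which is well defined because $n_i \ge 1 > \sigma$, and the part of the posterior without fixed jumps stays within the class (7) with $\gamma$ replaced by $\gamma + K_m$.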



3. Consistency. Our first goal consists in deriving sufficient conditions 
for weak consistency of the Bayesian nonparametric life-testing model (9) 
with mixture hazard (1), which covers also the case of data subject to right- 
censoring. Then, we exploit this criterion for obtaining consistency results 
for specific mixture hazards. 

In the case of complete data, a general and widely used sufficient condition
for weak consistency with respect to a "true" unknown density function $f_0$,
due to Schwartz [33], requires a prior $\Pi$ to assign positive probability to
Kullback-Leibler neighborhoods of $f_0$, that is,

(15) $\Pi(f \in \mathbb{F} : d_{KL}(f_0, f) < \varepsilon) > 0$ for any $\varepsilon > 0$,

where $d_{KL}(f_0, f) = \int \log(f_0(t)/f(t))\,f_0(t)\,dt$ denotes the Kullback-Leibler
divergence between $f_0$ and $f$.
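
To fix ideas on the divergence appearing in (15), the sketch below approximates it by quadrature for two exponential densities, for which the closed form $\log(a/b) + b/a - 1$ (rates $a$ for $f_0$ and $b$ for $f$) is standard. The exponential densities, the truncation point and the grid size are illustrative assumptions; they are not models considered in the paper.

```python
import math

def kl_divergence(f0, f, a, b, n=50000):
    """Midpoint-rule approximation of the integral of
    log(f0 / f) * f0 over [a, b]."""
    width = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * width
        total += math.log(f0(t) / f(t)) * f0(t)
    return total * width

def exponential_density(rate):
    """Density of an exponential lifetime, i.e. constant hazard `rate`."""
    return lambda t: rate * math.exp(-rate * t)

def kl_exponential(rate0, rate):
    """Closed form of d_KL(f0, f) for exponential densities."""
    return math.log(rate0 / rate) + rate / rate0 - 1.0

approx = kl_divergence(exponential_density(1.0), exponential_density(2.0), 0.0, 50.0)
```

The divergence is not symmetric in its two arguments, which is why the order $(f_0, f)$ in (15) matters.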

In the presence of right-censoring, we do not actually observe the lifetime
$Y$, but $(Z, \Delta)$, where $Z = Y \wedge C$ and $\Delta = \mathbb{I}_{(Y \le C)}$, for $C$ a censoring time with
distribution $P_c$ admitting density $f_c$. Clearly, this leads us to consider a prior
on the space $\mathbb{F} \times \mathbb{F}$ and the corresponding prior $\Pi^*$ induced on the space of
the distribution of the observables $(Z_i, \Delta_i)$'s.
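
The passage from lifetimes and censoring times to the observables can be written in one line. The helper name and the sample values below are hypothetical.

```python
def censor(lifetimes, censoring_times):
    """Return the observables (Z_i, Delta_i) = (min(Y_i, C_i), 1 if Y_i <= C_i else 0)."""
    return [(min(y, c), 1 if y <= c else 0)
            for y, c in zip(lifetimes, censoring_times)]

# Three hypothetical subjects: exact, right-censored, exact (tie Y = C).
observed = censor([2.0, 5.5, 1.3], [3.0, 4.0, 1.3])
```

Only pairs with indicator 1 carry an exact lifetime; the others only reveal that the lifetime exceeds the recorded value.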

The strategy of the proof consists in first rewriting the Kullback-Leibler
condition in terms of the induced prior $\Pi^*$: this condition then guarantees
consistency of $\Pi^*$. Moreover, it allows us to deduce the consistency of $\Pi$, the
prior on the distribution of the lifetime $Y$, under independent right-censoring
with the simple support condition

(16) supp$(P_c) = \mathbb{R}^+$.

The last step consists in translating the Kullback-Leibler condition into a
condition in terms of uniform neighborhoods of the true hazard rate $h_0$ on
the interval $(0,T]$ for any finite $T$. When dealing with models for hazard
rates, the latter appears to be both more natural and easy to verify.

Without risk of confusion, in the following we denote by $\Pi$ the prior
on $f$ and also the prior induced on $h$. Moreover, recall that the "true"
density $f_0$ can always be represented in terms of the "true" hazard $h_0$ as
$f_0(t) = h_0(t)\exp(-\int_0^t h_0(s)\,ds)$.

Theorem 2. Let $f$ be a random density function defined by (1) and
(9) with kernels (2)-(5) and denote its (prior) distribution by $\Pi$. Suppose
the distribution of the censoring times $P_c$ is independent of the lifetime $Y$,
absolutely continuous and satisfies (16). Moreover, assume that the following
conditions hold:

(i) $f_0(t)$ is strictly positive on $(0,\infty)$ and $\int_{\mathbb{R}^+} \max\{E[H(t)], t\}\,f_0(t)\,dt < \infty$;

(ii) there exists $r > 0$ such that $\liminf_{t \to 0} h(t)/t^r = \infty$ a.s.

Then, a sufficient condition for $\Pi$ to be weakly consistent at $f_0$ is that

(17) $\Pi\{h : \sup_{0 < t \le T} |h(t) - h_0(t)| < \delta\} > 0$

for any finite $T$ and positive $\delta$.

Some comments regarding the conditions are in order at this point. Let
us start with condition (i): the strict positivity of $f_0$ on $(0,\infty)$ is equivalent
to strict positivity of the "true" hazard $h_0$ on $(0,\infty)$, which is a property
satisfied by any reasonable $h_0$. The second part of condition (i), which is also
related to the asymptotic characterizations considered in Section 4, clearly
becomes more restrictive the faster the trend of the cumulative hazard.
However, note that if $h_0$ is a power function, then $f_0$ admits moments of any
order and, hence, it is enough that the trend of the cumulative hazard is a
power function as well. Condition (ii) allows one to remove the somewhat
artificial assumption of $h_0(0) > 0$ as in [5]. Indeed, $h_0(0) = 0$ represents a common
situation in practice and condition (ii) covers such a case by controlling the
small time behavior of $h$. Obviously, if $h_0(0) > 0$, then one would adopt a
random hazard $h$ nonvanishing in 0 and so condition (ii) would be
automatically satisfied. Overall, the result can be seen as a general consistency
criterion for mixture hazard models and deals automatically with the case
of independent right-censoring. Moreover, it should be extendable in a quite
straightforward way to mixture hazards with different reasonably behaving
kernels.

Before entering a detailed analysis of specific models, we show how
condition (ii) of Theorem 2 can be reduced to the problem of studying the short
time behavior of the CRM and, moreover, we establish that the CRMs
defined in (7) satisfy the corresponding short time behavior requirement.
Throughout this section we assume $\mathbb{X} = \mathbb{R}^+$ and, hence, when useful, $\tilde\mu$ will
be treated as an increasing additive process (see [32]), namely the cadlag
distribution function induced by $\tilde\mu$.

Proposition 3. Let h be a mixture hazard (1). Then condition (ii) in 
Theorem 2 is implied by: 






P. DE BLASI, G. PECCATI AND I. PRUNSTER 



(ii1) there exists $\varepsilon > 0$ such that $h(t) \ge c\,\tilde\mu((0,t])$ for $t \le \varepsilon$, where $c$ is a
constant not depending on $t$;

(ii2) there exists $r > 0$ such that $\liminf_{t \to 0} \tilde\mu((0,t])/t^r = \infty$ a.s.

In particular, (ii1) holds if $k$ is either the DL (2) or the OU (4) kernel; (ii2)
holds if $\tilde\mu$ is a CRM belonging to (7) with $\sigma \in (0,1)$ and $\lambda(dx) = dx$.

Condition (ii1) requires that the random hazard leaves the origin at least
as fast as the driving CRM, which is typically the case. Out of the four
considered kernels, we have to face the problem of $h(0) = 0$ a.s. for the DL
and OU mixtures and for both kernels (ii1) is satisfied. Condition (ii2) asks
to control the small time behavior of the CRM and is met by CRMs like (7).
If one is interested in CRMs different from (7), one can try to adapt one of
the several results on small time behavior known in the literature (see, e.g.,
[32] and references therein).
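
For the two kernels with $h(0) = 0$, the first requirement of Proposition 3 can be checked by hand. The following is a short sketch of that verification with the mixing measure on $\mathbb{R}^+$ written as $\tilde\mu$; it is meant as a reading aid and is not a substitute for the proof in the Appendix.

```latex
\begin{align*}
\text{DL kernel (2):}\quad
  & h(t) = \tilde\mu([0,t]) \;\ge\; \tilde\mu((0,t]),
    \qquad\text{so one may take } c = 1;\\[4pt]
\text{OU kernel (4):}\quad
  & h(t) = \sqrt{2\kappa}\int_{[0,t]} e^{-\kappa(t-x)}\,\tilde\mu(dx)
    \;\ge\; \sqrt{2\kappa}\,e^{-\kappa t}\,\tilde\mu((0,t])
    \;\ge\; \sqrt{2\kappa}\,e^{-\kappa\varepsilon}\,\tilde\mu((0,t]),
    \qquad t \le \varepsilon .
\end{align*}
```

In the OU case the constant $c = \sqrt{2\kappa}\,e^{-\kappa\varepsilon}$ depends on $\varepsilon$ and $\kappa$ but not on $t$, as the condition demands.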

We now move on to deriving explicit consistency results for mixture
hazard life-testing models based on the four kernels defined in (2)-(5). These
results are derived by verifying the conditions of Theorem 2 and, thus, hold
also for data subject to right-censoring with absolutely continuous censoring
distribution satisfying (16). Though the details of the proofs are different,
they rely on a common strategy: first consistency is established via condition
(17) for "true" hazards of mixture form $h_0(t) = \int_{\mathbb{R}^+} k(t,x)\,\mu_0(dx)$, where $k$
is the same kernel used for defining the specific model $h$; then, we show
that these mixture $h_0$'s are arbitrarily close in the uniform metric to any $h_0$
belonging to a class of hazards having a suitable qualitative feature.

We first deal with DL mixture hazards $h(t) = \int_{\mathbb{R}^+} \mathbb{I}_{(0 \le x \le t)}\,\tilde\mu(dx)$, which
represent a model for nondecreasing hazard rates. The result establishes
weak consistency of such models for any nondecreasing $h_0$ satisfying some
mild additional conditions.

Theorem 4. Let $h$ be a mixture hazard (1) with DL kernel and $\tilde\mu$
satisfying condition (ii2) of Proposition 3.

Then $\Pi$ is weakly consistent at any $f_0 \in \mathscr{F}_1$, where $\mathscr{F}_1$ is defined as the
set of densities for which: (i) $\int_{\mathbb{R}^+} E[H(t)]\,f_0(t)\,dt < \infty$; (ii) $h_0(0) = 0$ and
$h_0(t)$ is strictly positive and nondecreasing for any $t > 0$.

The second model we consider is represented by rectangular mixture
hazards $h(t) = \int_{\mathbb{R}^+} \mathbb{I}_{(|t-x| \le \tau)}\,\tilde\mu(dx)$. In order to obtain consistency with respect
to a large class of $h_0$'s we treat the bandwidth $\tau$ as a hyper-parameter and
assign to it an independent prior $\pi$, whose support contains $[0,L]$ for some
$L > 0$. So we have two sources of randomness: $\tilde\tau$ with distribution $\pi$ and $\tilde\mu$,
whose distribution we denote by $Q$. Hence, the prior distribution $\Pi$ on $h$ is
induced by $\pi \times Q$ via the map $(\tau, \mu) \mapsto h(\cdot \mid \tau, \mu) := \int \mathbb{I}_{(|\cdot - x| \le \tau)}\,\mu(dx)$. In this
framework we are able to derive consistency at essentially any bounded and
nonvanishing Lipschitz hazard $h_0$.

Theorem 5. Let $h$ be a mixture hazard (1) with rectangular kernel and
random bandwidth $\tilde\tau$ independent of $\tilde\mu$. Moreover, the support of the prior $\pi$
on $\tilde\tau$ contains $[0,L]$ for some $L > 0$.

Then $\Pi$ is weakly consistent at any $f_0 \in \mathscr{F}_2$, where $\mathscr{F}_2$ is defined as the
set of densities for which: (i) $\int_{\mathbb{R}^+} \max\{E[H(t)], t\}\,f_0(t)\,dt < \infty$; (ii) $h_0(t) > 0$
for any $t > 0$; (iii) $h_0$ is bounded and Lipschitz.

Now consider OU mixture hazards $h(t) = \int_{\mathbb{R}^+} \sqrt{2\kappa}\,e^{-\kappa(t-x)}\,\mathbb{I}_{(0 \le x \le t)}\,\tilde\mu(dx)$.
Define for any differentiable decreasing function $g$ the local exponential
decay rate as $-g'(y)/g(y)$. Our result establishes consistency at essentially any
$h_0$ which exhibits, in regions where it is decreasing, a local exponential
decay rate smaller than $\kappa\sqrt{2\kappa}$. This sheds also some light on the role of the
kernel-parameter $\kappa$: choosing a large $\kappa$ leads to less smooth trajectories of $h$,
but, on the other hand, ensures also consistency with respect to $h_0$'s which
have abrupt decays in certain regions.

Theorem 6. Let $h$ be a mixture hazard (1) with OU kernel and $\tilde\mu$
satisfying condition (ii2) of Proposition 3.

Then $\Pi$ is weakly consistent at any $f_0 \in \mathscr{F}_3$, where $\mathscr{F}_3$ is defined as the
set of densities for which: (i) $\int_{\mathbb{R}^+} \max\{E[H(t)], t\}\,f_0(t)\,dt < \infty$; (ii) $h_0(0) = 0$
and $h_0(t) > 0$ for any $t > 0$; (iii) $h_0$ is differentiable and, for any $t > 0$ such
that $h_0'(t) < 0$, the corresponding local exponential decay rate is smaller than
$\kappa\sqrt{2\kappa}$.

Remark 1. In the above three mixture hazard models, one typically
selects CRMs with $\lambda$ in (6) being the Lebesgue measure on $\mathbb{R}^+$. If this is
the case, then condition (i) in the definition of $\mathscr{F}_i$ ($i = 1,2,3$) becomes
$\int_{\mathbb{R}^+} t^2 f_0(t)\,dt < \infty$ for DL mixture hazards and $\int_{\mathbb{R}^+} t\,f_0(t)\,dt < \infty$ for
rectangular and OU mixtures.

Now we deal with mixture hazards based on an exponential kernel $h(t) =
\int_{\mathbb{R}^+} x^{-1} e^{-t/x}\,\tilde\mu(dx)$, which are used to model decreasing hazard rates. Note
that, in contrast to the DL, rectangular and OU kernels which all exhibit,
for any fixed $t$, finite support on $\mathbb{R}^+$ when seen as functions of $x$, in this case
the support is $\mathbb{R}^+$ for any fixed $t$. This implies the need for quite different
techniques for handling it. Recall that a function $g$ on $\mathbb{R}^+$ is completely
monotone if it possesses derivatives $g^{(n)}$ of all orders and $(-1)^n g^{(n)}(y) \ge 0$
for any $y > 0$. The next result shows that consistency holds at essentially
any completely monotone hazard for which $h_0(0) < \infty$.
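
It may help to see why completely monotone hazards are the natural target class here: the exponential kernel is itself completely monotone in $t$, and so is any mixture of it. The computation below is a sketch that assumes differentiation under the integral sign is permitted, which is an extra regularity assumption not discussed in the text.

```latex
\[
(-1)^n \frac{\partial^n}{\partial t^n}\Big(x^{-1}e^{-t/x}\Big)
   = x^{-(n+1)}\,e^{-t/x} \;\ge\; 0,
\qquad\text{hence}\qquad
(-1)^n h^{(n)}(t) = \int_{\mathbb{R}^+} x^{-(n+1)}\,e^{-t/x}\,\tilde\mu(dx) \;\ge\; 0 .
\]
```

Under that assumption every trajectory of the model is a completely monotone function, so consistency can only be expected at hazards sharing this qualitative feature.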

Theorem 7. Let $h$ be a mixture hazard (1) with exponential kernel such
that $h(0) < \infty$ a.s.

Then $\Pi$ is weakly consistent at any $f_0 \in \mathscr{F}_4$, where $\mathscr{F}_4$ is defined as the
set of densities for which: (i) $\int_{\mathbb{R}^+} t\,f_0(t)\,dt < \infty$; (ii) $h_0(0) < \infty$; (iii) $h_0$ is
completely monotone.

Note that the requirement of $h$ not to explode in 0 is easily achieved by
selecting $\lambda$ in (6) such that $\int_{\mathbb{R}^+ \times \mathbb{R}^+} (1 - e^{-u x^{-1} v})\,\rho(dv|x)\,\lambda(dx) < \infty$ for all
$u > 0$, which is equivalent to $h(0) < \infty$ a.s. [see (36) in Appendix A.1].

4. Fixed sample size posterior CLTs. In this section we derive CLTs for 
functionals of the random hazard given a fixed set of observations as time 
diverges. For the sake of clarity, in the following we confine ourselves to the 
case of complete observations; however, all subsequent results immediately 
carry over to the case of data subject to right-censoring. 

4.1. Further concepts and notation. Since we will heavily exploit the pos- 
terior characterization of h recalled in Theorem 1, it is useful to introduce 
first some definitions related to quantities involved in its statement.
Whenever convenient, we shall use the notation $\nu^{0,*} := \nu$ and $\tilde\mu^{0,*} := \tilde\mu$, that is,
$\tilde\mu^{0,*}$ is the "prior" CRM and $\nu^{0,*}$ is its intensity measure. For every $n \ge 0$,
$q, p \ge 1$, we denote by

$L^p((\nu^{n,*})^q) = L^p\big((\mathbb{R}^+ \times \mathbb{X})^q, (\mathscr{B}(\mathbb{R}^+) \otimes \mathscr{X})^q, (\nu^{n,*})^q\big)$

the Banach space of real-valued functions $f$ on $(\mathbb{R}^+ \times \mathbb{X})^q$, such that $|f|^p$ is
integrable with respect to $(\nu^{n,*})^q := (\nu^{n,*})^{\otimes q}$. We write $L^p((\nu^{n,*})^1) = L^p(\nu^{n,*})$,
$p \ge 1$. The symbol $L^2_s((\nu^{n,*})^2)$ is used to denote the Hilbert subspace of
$L^2((\nu^{n,*})^2)$ generated by the symmetric functions on $(\mathbb{R}^+ \times \mathbb{X})^2$. Note that
a function $f$, on $(\mathbb{R}^+ \times \mathbb{X})^2$, is said to be symmetric whenever $f(s,x;t,y) =
f(t,y;s,x)$ for every $(s,x), (t,y) \in \mathbb{R}^+ \times \mathbb{X}$.

Now we introduce various kernels which will enter either the statements
or the conditions of the posterior CLTs. For $n \ge 0$, we denote the posterior
hazard rate and posterior cumulative hazard, given $\mathbf{X}$ and $\mathbf{Y}$, by

(18) $h^{\Delta}_{n,*}(t) = \int_{\mathbb{X}} k(t,x)\,[\tilde\mu^{n,*}(dx) + \Delta^{n,*}(dx)] = h^{n,*}(t) + \sum_{i=1}^k J_i\,k(t,X_i^*),$

(19) $H^{\Delta}_{n,*}(T) = \int_0^T h^{\Delta}_{n,*}(t)\,dt = H^{n,*}(T) + \sum_{i=1}^k J_i \int_0^T k(t,X_i^*)\,dt.$

In (18) and (19), we implicitly introduced the notation $h^{n,*}(t)$ and $H^{n,*}(T)$
for, respectively, the hazard rate and cumulative hazard without fixed points
of discontinuity. Note that $h^{\Delta}_{0,*}(t)$ coincides with $h(t)$, the prior hazard rate.
Furthermore, we need to define two basic classes of kernels:



ASYMPTOTICS FOR POSTERIOR HAZARDS 






(i) for every $n \ge 0$ and every $f \in L^2((\nu^{n,*})^2)$, the kernel $f \star_{1,n} f$ is defined
on $(\mathbb{R}^+ \times \mathbb{X})^2$ and is equal to the contraction

(20) $f \star_{1,n} f(t_1,x_1;t_2,x_2) = \int_{\mathbb{R}^+ \times \mathbb{X}} f(t_1,x_1;s,y)\,f(s,y;t_2,x_2)\,\nu^{n,*}(ds,dy);$

(ii) for every $n \ge 0$ and every $f \in L^2((\nu^{n,*})^2)$, the kernel $f \star_{2,n} f$ is defined
on $\mathbb{R}^+ \times \mathbb{X}$ and is given by

(21) $f \star_{2,n} f(t,x) = \int_{\mathbb{R}^+ \times \mathbb{X}} f(t,x;s,y)^2\,\nu^{n,*}(ds,dy).$

The "star" notation is rather common, see, for example, [16, 28, 34]. Note
that the Cauchy-Schwarz inequality yields that $f \star_{1,n} f \in L^2((\nu^{n,*})^2)$. It is
worth noting that the two operators "$\star_{1,n}$" and "$\star_{2,n}$", which appear in
the statements of our CLTs, can be used to obtain explicit (combinatorial)
expressions of the moments and of the cumulants associated with single and
double integrals with respect to a Poisson (completely) random measure.
See [31] for a discussion of this point.
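
A discrete analogue may help in reading (20) and (21). In the sketch below the intensity measure is replaced by finitely many point masses, so that a kernel in two variables becomes a matrix; this replacement, the weights and the matrix entries are all hypothetical and serve only to illustrate the two contractions and the Cauchy-Schwarz bound mentioned above.

```python
import math

def contraction_1(f, weights):
    """Discrete version of (20):
    result[a][b] = sum over c of f[a][c] * f[c][b] * weights[c]."""
    m = len(weights)
    return [[sum(f[a][c] * f[c][b] * weights[c] for c in range(m))
             for b in range(m)] for a in range(m)]

def contraction_2(f, weights):
    """Discrete version of (21):
    result[a] = sum over c of f[a][c] ** 2 * weights[c]."""
    m = len(weights)
    return [sum(f[a][c] ** 2 * weights[c] for c in range(m)) for a in range(m)]

def l2_norm_two_variables(g, weights):
    """Norm of g in L^2 of the product measure weights x weights."""
    m = len(weights)
    return math.sqrt(sum(g[a][b] ** 2 * weights[a] * weights[b]
                         for a in range(m) for b in range(m)))

weights = [0.5, 1.0, 2.0]        # hypothetical point masses
f = [[1.0, 0.2, 0.0],
     [0.2, 0.5, 0.3],
     [0.0, 0.3, 2.0]]            # symmetric kernel on the three points
g = contraction_1(f, weights)
```

In this finite setting the first contraction is a weighted matrix product, it inherits the symmetry of the kernel, and its norm is bounded by the squared norm of the kernel.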

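Introduce now a last set of kernels which will appear in the conditions of
the results discussed in Section 4. Fix $n \ge 0$, take $T$ such that $0 < T < +\infty$
and define

(22) $k_T^{(0)}(s,x) = s \int_0^T k(t,x)\,dt, \qquad (s,x) \in \mathbb{R}^+ \times \mathbb{X},$

(23) $k_T^{(1)}(s,x;t,y) = \frac{st}{T} \int_0^T k(u,x)\,k(u,y)\,du,$

(24) $k_T^{(2)}(s,x) = \frac{s^2}{T} \int_0^T k(u,x)^2\,du,$

(25) $k_{T,n}^{(3)}(s,x) = \int_{\mathbb{R}^+ \times \mathbb{X}} k_T^{(1)}(s,x;u,w)\,\nu^{n,*}(du,dw).$

Finally, for $(s,x) \in \mathbb{R}^+ \times \mathbb{X}$ define the random kernel

(26) $k_{T,n}^{(4)}(s,x) = \frac{s}{T} \int_0^T k(u,x) \int_{\mathbb{X}} k(u,y)\,\Delta^{n,*}(dy)\,du = \frac{s}{T} \sum_{i=1}^k J_i \int_0^T k(u,x)\,k(u,X_i^*)\,du.$
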

4.2. General results. Before stating the results concerning the asymp- 
totic behavior of functionals of random hazards, we need to make some more 
technical assumptions, which do not appear to be very restrictive; indeed, 
in the following examples, involving kernels and CRMs commonly exploited 
in practice, they will be shown to hold. 

In the sequel we consider mixture hazards (1) which, in addition to (H1)-
(H2), satisfy also

(H3) $\int_{\mathbb{R}^+ \times \mathbb{X}} k(t,x)^j\,v^j\,\rho(dv|x)\,\lambda(dx) < +\infty \quad \forall t,\ j = 1, 2, 4;$

     $\int_0^T \int_{\mathbb{R}^+ \times \mathbb{X}} k(t,x)^j\,v^j\,\rho(dv|x)\,\lambda(dx)\,dt < +\infty \quad \forall T > 0,\ j = 2, 4.$

See [27, 28] for a discussion of these conditions. Recall from (18) that $h^{n,*}(t)$
stands for the posterior hazard without fixed points of discontinuity (given
$\mathbf{X}$ and $\mathbf{Y}$) and is characterized by (13). It is straightforward to see that, if
the prior hazard rate satisfies (H1)-(H3), then $h^{n,*}(t)$ meets (H1)-(H3) as
well.

Given an event $B \in \mathscr{F}$, we will say that $B$ has $P\{\cdot|X,Y\}$-probability 1 whenever there exists $\Omega' \in \mathscr{F}$ such that $P\{\Omega'\} = 1$, and, for every fixed $\omega \in \Omega'$, the random probability measure $A \mapsto P\{X \in A|Y\}(\omega)$ has support contained in the set of those $(x_1,\ldots,x_n) \in \mathbb{X}^n$ such that

$P\{B|X = (x_1,\ldots,x_n), Y\} = 1$.

Finally, fix a sample size n > 1 for the remainder of the section. The following 
Theorems 8, 9 and 10 provide sufficient conditions to have that linear and 
quadratic functionals associated with posterior random hazard rates verify 
a CLT. The first result deals with linear functionals. 



Theorem 8 (Linear functionals). Suppose: (i) $k_T^{(0)} \in L^3(\nu^{n,*})$ for every $T > 0$; (ii) there exists a strictly positive (deterministic) function $T \mapsto C_0(n,k,T)$ such that, as $T \to +\infty$,

(27) $C_0^2(n,k,T) \times \int_{\mathbb{R}^+\times\mathbb{X}} [k_T^{(0)}(s,x)]^2\,\nu^{n,*}(ds,dx) \to \sigma_0^2(n,k)$,

(28) $C_0^3(n,k,T) \times \int_{\mathbb{R}^+\times\mathbb{X}} [k_T^{(0)}(s,x)]^3\,\nu^{n,*}(ds,dx) \to 0$,

where $\sigma_0^2(n,k) \in (0,+\infty)$. Also assume that, with $P\{\cdot|X,Y\}$-probability 1,

(29) $\lim_{T\to+\infty} C_0(n,k,T) \times \sum_{i=1}^{k} J_i\int_0^T k(t,X_i^*)\,dt = m(n,\Delta^{n,*},k) \in [0,+\infty)$.

Then, a.s.-P, for every real $\lambda$,

$E[\exp(i\lambda C_0(n,k,T)[H(T) - E[H^{n,*}(T)]])\,|\,Y] \to E[\exp(i\lambda\,m(n,\Delta^{n,*},k) - \frac{\lambda^2}{2}\sigma_0^2(n,k))\,|\,Y]$ as $T \to +\infty$.

Remark 2. When $n = 0$ and setting, by convention, $Y = X = 0$ so that $\sigma\{Y,X\} = \{\Omega,\emptyset\}$, one recovers Theorem 1 in [27] for prior random hazards. The same applies for the following two results concerning path-second moments and path-variances.

Theorem 9 (Path-second moments). Suppose k)p n G L 2 (v n '*)nL l (v n >*), 

4^ G L 3 (z^ n '*) and that there exists a strictly positive function Ci(n,k,T) 
such that the following asymptotic conditions are satisfied as T — > +oo; 

1. 2C 1 2 (n,fc,T)||4 1) ||| 2(( ^ )2) -^a 2 (n,k) G (0,+oo); 

2. Cf(n, k, T) || k T lll/i^n,*^) ^0; 

3. Cf(n,k,T)\\k T *\ n 4 || 2 2((j,n,*)2) ~~ * 0/ 

^4/^ U r T\ II tSX) . 1 J.WII2 



4. C-^n, A;,T)||A; r * 2 ,n^T IIl 2 (i/™.*) ~~ * ^ 



5. C 2 (n,fc,T)||4 2) + 24 3 L + 2 4 4) A"'*lli 2 (^*) -^04(n ) A n >*,fc) G |0,+oo), u»<fc 



P{-|X,Y}-pro6a6i/% i; 

Oi-( 4 ) 113 



6. C 1 3 (n,A;,r)||4 2) + 24 3 L + 2 4 4) A™.*ll!3( I ,".*) ^ °> withP{-\X,Y}-probability 

i; 

7. with P{-|X, Y} -probability 1, 

Cl(n r fc ' T) ((E^Pi)) dt->«(n,A»*AO€[0,+oo) 
Moreover, define 



(30) ^*:=£-f/ nh n '*(t)]k(t,X*)dt 

Then, a.s.-P, for every real X, 

1 /-T 1 /-T 



E 



exp (^iXCi (n, /c, T) | ^ ^ ~ h (tf dt - A n / - ^ E[h n '*(t) 2 ] dty 



E 



2 



i\v(A n '*,k) - y (a 2 (n, /c) + a 2 (n, A n '*, jfc)) 



Theorem 10 (Path-variances). Suppose that the assumptions of Theorem 8 and Theorem 9 are satisfied. Assume, moreover, that

1. C x {n,k,T)/{TC Q {n,k,T)) 2 ^Q; 

2. 2Ci(n,A;,r)E[ J H" n '*(r)]/(r 2 Co(n,A;,r))^5(n,A;) G M; 

3. ||Ci(n,A:,T)(4 2) + 2^?L + 24%.*) ~ ^k)C (n,k,T)kP\\ 2 L2{iyn , t) 
cj 2 (n, A n '*,fc) G [0,+oo), loito P{-|X, Y} -probability 1 
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and A T ' is given by (30). Then, a.s.-P, for every real A 

E [ e iXDl{n,k,T){l/T ^[h{t)-H(T)/Tfdt-A n T *~l/T %[h n -*(tf] dt+E[H™<* (T)] 2 /T 2 } | y j 

Remark 3. We stress that, in general, the four quantities m(n, A n '* , k), 
o"|(n, A n '*, k), v(n, A n, *,k) and cr 2 (ra, A n '*, fc) (appearing in the previous 
three statements) can be random. 

4.3. Applications. In this section we derive CLTs for functionals of posterior hazards based on the four kernels (2)-(5), combined with generalized gamma CRMs [2], namely CRMs as in (7) with $\gamma$ a positive constant. The measure $\lambda$ is chosen such that the life-testing model is well defined and (H1)-(H3) are met. Many other classes of CRMs represent possible alternatives, and one can proceed as below. It is important to recall that consistency of all the models dealt with below is easily deduced from the results in Section 3.

In all cases we conclude that the asymptotic behavior of functionals of the posterior hazard rate coincides exactly with the behavior of functionals of the prior hazard. To see why this happens, let us focus on the behavior of the trend of the posterior CRM. It turns out that $E[H^{n,*}(T)] \sim \psi_1(T) + \psi_2(T;Y)$, where $\psi_1(T) \sim E[H(T)]$ and $\psi_2(T;Y)$ explicitly depends on the data Y, is different from 0 for every $T > 0$ and $\psi_2(T;Y) = o(\psi_1(T))$. Moreover, once the normalizing rate $C_0(n,k,T)$ of the oscillations around the trend is computed, one finds that $C_0(n,k,T) = C_0(k,T)$ and $C_0(k,T) \times \psi_2(T;Y) \to 0$ as $T \to \infty$. To fix ideas, consider a DL mixture hazard with generalized gamma CRM given one observation $Y_1$; one obtains



2 



E[F 1 '*(T)] = ^^-T 



Y 1 r y i 1 



■ alx 



+ 0(1) 



_7 1 ~ CT Jo (Yi -x + ~i) l - a 

and, since the divergence rate $C_0(n,k,T)^{-1}$ is equal to $T^{3/2}$, the influence of
the data vanishes at a rate $T^{-1/2}$. Similar phenomena occur when studying
the asymptotic behavior of the part of the posterior corresponding to the 
fixed points of discontinuity. This basically explains why the forthcoming 
CLTs do not depend on the data. Such an outcome is quite surprising, at 
least to us. Note, indeed, that the Poisson intensity of the posterior CRM 
(13) depends explicitly on the data Y, which implies that the posterior haz- 
ard, and a fortiori the posterior cumulative hazard, depend on the data for 
any T. Also, the fact that the variance of the asymptotic Gaussian distri- 
bution is not influenced by the data is somehow counterintuitive: since the 
contribution of the CRM vanishes in the limit, one would expect the variance to become smaller and smaller as more data come in. Since this does not happen, our findings provide some evidence that the choice of the CRM really matters whatever the size of the dataset. Hence, one should carefully select the kernel and CRM so as to incorporate prior knowledge appropriately into the model; the neat CLTs presented here provide a guideline in this respect by highlighting trend, oscillation around the trend and asymptotic variance.



4.3.1. Asymptotics for kernels with finite support. We start by considering kernels with finite support, namely, the DL, OU and rectangular ones with generalized gamma CRM and take $\lambda$ to be the Lebesgue measure on $\mathbb{R}^+$. This ensures that (H1)-(H3) are satisfied. For a generalized gamma CRM one has, for any $c > 0$, $\int_0^\infty s^c\rho(ds) = (1-\sigma)_{c-1}\,\gamma^{-c+\sigma} := K_\rho^{(c)}$, where $(a)_n := \Gamma(a+n)/\Gamma(a)$ denotes the Pochhammer symbol. Since in the posterior the CRM becomes nonhomogeneous with updated intensity (13), the verification of the conditions of Theorems 8-10 can become cumbersome. However, for any Borel set $A \subseteq \mathbb{R}^+\times\mathbb{R}^+$, one has

(31) $\underline{\nu}(A) \le \nu^{n,*}(A) \le \nu(A)$,

where $\underline{\nu}(dv,dx) := \exp\{-n\,k^{(0)}_{Y_{(n)}}(v,x)\}\,\nu(dv,dx)$ and $Y_{(n)}$ stands for the largest lifetime. Having a lower and an upper bound for the Poisson intensity $\nu^{n,*}$ then allows one to use, conditionally on X, Y, a comparison result analogous to Theorem 4 of [27] in order to check the conditions of the posterior CLTs.
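The moment identity for the generalized gamma jump intensity can be checked numerically. The following is a minimal sketch, assuming the jump part of the intensity in (7) is $\rho(ds) = s^{-1-\sigma}e^{-\gamma s}\,ds/\Gamma(1-\sigma)$ (an assumption, since (7) is not restated here); the function names and the quadrature settings are illustrative only.

```python
import math

def gg_moment_closed_form(c, sigma, gamma):
    """(1 - sigma)_{c-1} * gamma^(sigma - c), with (a)_n = Gamma(a + n) / Gamma(a)."""
    return math.gamma(c - sigma) / math.gamma(1.0 - sigma) * gamma ** (sigma - c)

def gg_moment_numeric(c, sigma, gamma, lo=-60.0, hi=8.0, step=1e-3):
    """Trapezoidal rule for int_0^inf s^c rho(ds) after the substitution s = exp(y).

    Assumed intensity: rho(ds) = s^(-1 - sigma) * exp(-gamma * s) / Gamma(1 - sigma) ds.
    """
    n = int(round((hi - lo) / step))
    total = 0.0
    for i in range(n + 1):
        y = lo + i * step
        s = math.exp(y)
        # s^c * s^(-1 - sigma) * exp(-gamma * s) * (ds/dy = s) = exp((c - sigma) * y - gamma * s)
        val = math.exp((c - sigma) * y - gamma * s)
        total += val * (0.5 if i in (0, n) else 1.0)
    return total * step / math.gamma(1.0 - sigma)

if __name__ == "__main__":
    for c in (1, 2, 4):
        print(c, gg_moment_closed_form(c, 0.5, 2.0), gg_moment_numeric(c, 0.5, 2.0))
```

For $c = 1, 2, 4$ these are the constants $K_\rho^{(1)}$, $K_\rho^{(2)}$ and $K_\rho^{(4)}$ that enter the trends and variances below.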

Let us first consider linear functionals for the OU kernel. Note that $k_T^{(0)}(v,x) = v\sqrt{2/\kappa}\,(1 - e^{-\kappa(T-x)})\,1_{(0\le x\le T)}$, and that $k_T^{(0)} \in L^3(\nu)$, so that condition (i) of Theorem 8 is a direct consequence of (31). Next, one can check that $\|k_T^{(0)}\|^2_{L^2(\underline{\nu})} \sim \|k_T^{(0)}\|^2_{L^2(\nu)} \sim 2\kappa^{-1}K_\rho^{(2)}T$. In fact, the dominating term in the norm with respect to $\underline{\nu}$ is the integral over $\mathbb{R}^+\times[Y_{(n)},\infty)$, which is in turn equal to the dominating term of $\|k_T^{(0)}\|^2_{L^2(\nu)}$. Moreover, $T^{-3/2}\|k_T^{(0)}\|^3_{L^3(\nu^{n,*})} \to 0$ and we have that (27) and (28) are satisfied with $C_0(n,k,T) = C_0(0,k,T) = 1/\sqrt{T}$ and $\sigma_0^2(n,k) = \sigma_0^2(0,k) = 2\kappa^{-1}K_\rho^{(2)}$, which importantly does not depend on the observations Y. As for (29), with

$P\{\cdot|X,Y\}$-probability 1, we have $\sum_{i=1}^{k} k_T^{(0)}(J_i,X_i^*) = O(1)$ as $T \to \infty$, so that (29) holds with $m(n,\Delta^{n,*},k) = 0$, not depending on X, Y. Finally, since $\|k_T^{(0)}\|_{L^1(\underline{\nu})} \sim \|k_T^{(0)}\|_{L^1(\nu)} \sim K_\rho^{(1)}\sqrt{2/\kappa}\,T$, then $E[H^{n,*}(T)] \sim E[H(T)]$. Hence, from Theorem 8 combined with the fact that the limiting mean does not depend on Y, it follows that
(32) $E\left[\exp\left(i\lambda\,\frac{H(T) - \sqrt{2/\kappa}\,\gamma^{-1+\sigma}\,T}{T^{1/2}}\right)\Big|\,Y\right] \to \exp\left(-\frac{\lambda^2}{2}\sigma_0^2(0,k)\right)$ as $T \to +\infty$,

where $\sigma_0^2(0,k) = 2\kappa^{-1}(1-\sigma)\gamma^{-2+\sigma}$. Therefore, the posterior cumulative hazard has the same asymptotic behavior as the prior cumulative hazard. As mentioned before, this is quite surprising also in the light of the consistency result.
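As a sanity check on the constant in (32), the prior quantity $C_0^2\,\|k_T^{(0)}\|^2_{L^2(\nu)}$ can be evaluated numerically and compared with $2\kappa^{-1}K_\rho^{(2)}$. This is only a sketch for the prior intensity with $\lambda$ the Lebesgue measure; it does not touch the posterior intensity $\nu^{n,*}$, and the function names and parameter values are illustrative assumptions.

```python
import math

def ou_sigma0_sq(kappa, sigma, gamma):
    """Limit 2 * K_rho^(2) / kappa, with K_rho^(2) = (1 - sigma) * gamma^(sigma - 2)."""
    return 2.0 * (1.0 - sigma) * gamma ** (sigma - 2.0) / kappa

def ou_scaled_second_moment(T, kappa, sigma, gamma, n_grid=100_000):
    """(1/T) * K_rho^(2) * int_0^T (2/kappa) * (1 - exp(-kappa*(T - x)))^2 dx, midpoint rule."""
    K2 = (1.0 - sigma) * gamma ** (sigma - 2.0)
    h = T / n_grid
    acc = 0.0
    for i in range(n_grid):
        x = (i + 0.5) * h
        acc += (1.0 - math.exp(-kappa * (T - x))) ** 2
    return K2 * (2.0 / kappa) * acc * h / T

if __name__ == "__main__":
    limit = ou_sigma0_sq(1.5, 0.4, 2.0)
    for T in (10.0, 1e3, 1e4):
        print(T, ou_scaled_second_moment(T, 1.5, 0.4, 2.0) / limit)
```

The printed ratios approach 1 from below as $T$ grows, in line with the normalization $C_0(0,k,T) = 1/\sqrt{T}$.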

Let us now consider the path-second moment. We obtain (v,x;u,y) = 



uv_ e K(x+y) ^ e -2K(xVy) 



-2kT 



rem 9, one finds that II h 



(1) 112 



0< x,y<T) i 



\k 



(1)[|2 



e 

k T llL2(F 2 ) 

Ci(n,k,T) = VT and a 2 (n,k) = 2n~ 1 {Kp i ' 1 ) 2 , which coincide with the case 
n = 0. The idea here is the same as before, namely that the dominating 
term of the norm with respect to V 2 is the integral over (M + x \Vt n \, oo]) 2 , 

which is equal to the dominating term of \\k^ W^r^y Then, conditions 2., 3. 
and 4. are verified since they are verified for n = 0. In particular, note that 

for i = 1,2. As for condition 5., one first check 



and, as for condition 1 in Theo- 



T llL 2 (i/ 2 ) 
(2)n2 



2n- l (Kf ) ) 2 T and, hence, 



\ rj 1 

that k, 



H,0 "T 



(1) 



V T A n '* ~ 0(T 1 ), then some tedious algebra allows to verify that it 
is satisfied with aj(n, A n <*,k) = K ( p A) + ^K^K^ + ^kP(K^) 2 . This is, 

(3) 

indeed, a delicate point since both kq,' and the norm with respect to the 
updated Poisson intensity v n >* depend on the posterior. Once this is done, it 
is not difficult to check that condition 6. is satisfied. Moreover, the quantity 
v(n,A n, *,k) in condition 7. can be shown to be 0, whereas A^p* = 0(T _1 ) 
in (30). Finally, one can check that ± E[h n <* (t) 2 } dt ~ ± E[h°<* (t) 2 } dt ~ 

K^f 1 + ^(Kp 1 ^) 2 , so that, from Theorem 9, we deduce the following CLT for 
the path-second moment: 

E [exp (iXVfl i [ T h(tf dt - 7" 2+<T ( 1 - a + — 
(33) L V lT7 ° 



exp (-y (a 2 (n, fc) + a 2 (n, A n '*,k))j , 
where cr 2 (n, /c) + cr 2 (n, A"'* , k) 



,2f^ l\ i ^2/ 



(l-o-)(16B- 1 7 2 ' 7 +2(9-5g-)7' T +K(2-o-)2) 
K7 4 " CT ' 

As far as the path-variance is concerned, one verifies easily that the condi- 
tions of Theorem 10 are satisfied, with 5(n, k) = ^^-Kp 1 ^ and a 2 (n, A n '*, k) = 



(i) 



mean of the path- variance, one finds that ^ Jq K\h n '* (t) 



p , which again do not depend on the observations Y. As for the posterior 

T„ t ln.*u\ E[H"-*(T)} ]2 
T 



■dt 



Kf ] + o(T" 1 / 2 ), so that Theorem 10 leads to 



-1/2 



E 



(34) 



exp iX 



Hi) 



H(T) 



dt 



(7 



7 



2-cr 



exp (-y (a 2 (n, k) + a 2 (n, A n '*,k))j , 
where af (n, k) + o\ (n, A n '* , k) 



K7 4 ~ CT ' 
For the other two kernels, namely rectangular and DL, one can proceed along the same lines of reasoning and, again, the asymptotic posterior behavior coincides with that of the prior. In particular, one obtains that for linear functionals and quadratic functionals of hazard rates based on the rectangular kernel, the CLTs (32), (33) and (34) hold with the same rate functions and appropriately modified constants and variances (for the exact values see [27], since they coincide with the a priori ones). As for the DL kernel the CLT for the posterior cumulative hazard is of the form



With reference to quadratic functionals, in this case, some of the conditions of Theorems 9 and 10 are violated already for the prior (see [27] for details).

4.3.2. Asymptotics for exponential kernel. Here we consider random hazards based on the exponential kernel. Indeed, it is crucial to consider also a kernel with full support, since one may think that the lack of dependence on the data of posterior functionals may be due to the boundedness of the support of the kernels dealt with in Section 4.3.1. However, it turns out that, again, the posterior CLTs coincide with the corresponding prior CLTs.

In particular, set, within (7), $\lambda(dx) = x^{-1/2}e^{-1/x}(2\sqrt{\pi})^{-1}\,dx$: this implies that $h(0) < \infty$ a.s., (8) holds and (H1)-(H3) are satisfied. This model is of interest also beyond the scope of the present asymptotic analysis; in fact, it leads to a prior mean $E[h(t)] = \tfrac{1}{2}K_\rho^{(1)}(t+1)^{-1/2}$ and, thus, we have a nonparametric prior centered on a quasi Weibull hazard, which is a desirable feature in survival analysis.

We start by investigating the linear functional of $h$: here we provide details also for the derivation of the prior CLTs since this model has not been considered in [27]. In this case, we have that $k_T^{(0)}(v,x) = v(1 - e^{-T/x})$ and $k_T^{(0)} \in L^3(\nu)$ for all $T > 0$ and the same holds for the posterior. We also have that $\|k_T^{(0)}\|_{L^1(\nu)} = K_\rho^{(1)}(\sqrt{T+1} - 1)$, so that, as $T \to \infty$, $E[H(T)] \sim K_\rho^{(1)}\sqrt{T}$.

Therefore, $E[H^{n,*}(T)] \sim E[H(T)]$. Similar arguments lead to show that $\|k_T^{(0)}\|^2_{L^2(\underline{\nu})} \sim \|k_T^{(0)}\|^2_{L^2(\nu)} \sim (2-\sqrt{2})K_\rho^{(2)}\sqrt{T}$ and, hence, we have $C_0(n,k,T) = C_0(0,k,T) = T^{-1/4}$ and $\sigma_0^2(n,k) = \sigma_0^2(0,k) = (2-\sqrt{2})K_\rho^{(2)}$. Moreover, $\|k_T^{(0)}\|^3_{L^3(\nu)} \sim O(\sqrt{T})$ is sufficient for concluding that (28) holds both for the prior and the posterior. Finally, as $T \to \infty$, $\sum_{i=1}^{k} k_T^{(0)}(J_i,X_i^*) = O(1)$ with $P\{\cdot|X,Y\}$-probability 1; thus, also in this case (29) holds with $m(n,\Delta^{n,*},k) = 0$. We can then deduce from Theorem 8 that



$E\left[\exp\left(i\lambda\,\frac{H(T) - \gamma^{-(1-\sigma)}T^{1/2}}{T^{1/4}}\right)\Big|\,Y\right] \to \exp\left(-\frac{\lambda^2}{2}\sigma_0^2(0,k)\right)$ as $T \to \infty$,

for any sample size $n \ge 0$ and with $\sigma_0^2(0,k) = (2-\sqrt{2})(1-\sigma)\gamma^{-2+\sigma}$. Hence, we have shown that the exponential kernel hazard exhibits a trend of order $T^{1/2}$ and oscillations of order $T^{1/4}$, and verifies exactly the same CLT for both prior and posterior cumulative hazard, thus confirming that the asymptotics is not influenced by the data.
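The two norms used above for the exponential kernel admit closed forms under the stated choice of $\lambda$, namely $\int (1-e^{-T/x})\,\lambda(dx) = \sqrt{T+1}-1$ and $\int (1-e^{-T/x})^2\lambda(dx) = 2\sqrt{T+1}-\sqrt{2T+1}-1$. The sketch below compares them with a plain quadrature; it covers only the $\lambda$-integrals (the factors $K_\rho^{(1)}$ and $K_\rho^{(2)}$ multiply them), and the function names and grid settings are illustrative assumptions.

```python
import math

def exp_kernel_integral_closed(T, j):
    """Closed form of int (1 - exp(-T/x))^j lambda(dx) for j = 1, 2."""
    if j == 1:
        return math.sqrt(T + 1.0) - 1.0
    if j == 2:
        return 2.0 * math.sqrt(T + 1.0) - math.sqrt(2.0 * T + 1.0) - 1.0
    raise ValueError("only j = 1, 2 are covered")

def exp_kernel_integral_numeric(T, j, lo=-8.0, hi=90.0, step=1e-3):
    """Trapezoidal rule with x = exp(y), lambda(dx) = x^(-1/2) exp(-1/x) dx / (2 sqrt(pi))."""
    n = int(round((hi - lo) / step))
    total = 0.0
    for i in range(n + 1):
        y = lo + i * step
        inv_x = math.exp(-y)
        base = -math.expm1(-T * inv_x)            # 1 - exp(-T/x), stable for large x
        val = base ** j * math.exp(0.5 * y - inv_x)  # x^(-1/2) * exp(-1/x) * dx/dy
        total += val * (0.5 if i in (0, n) else 1.0)
    return total * step / (2.0 * math.sqrt(math.pi))

if __name__ == "__main__":
    for j in (1, 2):
        print(j, exp_kernel_integral_closed(5.0, j), exp_kernel_integral_numeric(5.0, j))
    # ratio of the j = 2 integral to sqrt(T) for a large T, to compare with 2 - sqrt(2)
    print(exp_kernel_integral_closed(1e8, 2) / math.sqrt(1e8), 2.0 - math.sqrt(2.0))
```

Dividing the $j = 2$ closed form by $\sqrt{T}$ and letting $T$ grow recovers the factor $2-\sqrt{2}$ in $\sigma_0^2$.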

Our results for quadratic functionals do not apply to the exponential kernel. To see this, note that $k_T^{(1)}(v,x;u,y) = \frac{uv}{T(x+y)}\left(1 - \exp\left\{-\frac{x+y}{xy}T\right\}\right)$ and, by calculating the norm with respect to $\nu^2$, we get

ll^ll 2 (4 2) ) 2 

Ft i 6 (2T2+3T + 1)' 

which implies Ci(0, k, T) = T. However, 11^4(^2) ~ d being a positive 
constant, so that condition 2 in Theorem 9 does not hold. 



5. Concluding remarks. In the present paper we have investigated two different asymptotic aspects of a random hazard model, namely consistency and the behavior of functionals of the hazard as time diverges. As for the former, we have provided a general weak consistency criterion for mixture random hazards and established weak consistency for specific models with respect to large classes of "true hazards" $h_0$. It seems worth discussing briefly the case of Weibull hazards, that is, $h_0(t) = (\alpha/\lambda)(t/\lambda)^{\alpha-1}$ with $\alpha, \lambda > 0$, which are widely used in the parametric setup. The case of $\alpha > 1$ is covered by both Theorem 4 (DL kernel) and Theorem 6 (OU kernel). When $\alpha < 1$, $h_0$ is a completely monotone function and it would naturally belong to the domain of attraction of Theorem 7; however, in such a case $h_0(0)$ is not finite and, hence, the required conditions are not met. Nonetheless, $h_0$ can be approximated to any order of accuracy by $h_\varepsilon(t) = (\alpha/\lambda)((t+\varepsilon)/\lambda)^{\alpha-1}$, for some small enough $\varepsilon > 0$, when accuracy is measured in terms of survival functions. In fact, it is easy to see that for $S_0(t)$ and $S_\varepsilon(t)$, the survival functions corresponding to $h_0$ and $h_\varepsilon$, respectively, $\sup_t|S_0(t) - S_\varepsilon(t)|$ goes to zero as $\varepsilon$ approaches zero. Finally, note that Theorem 7 applies to $h_\varepsilon$ for any $\varepsilon > 0$. Further work is needed in order to extend the consistency result to completely monotone hazards which explode at zero; for such cases, condition (17) is probably too strong.
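The approximation of a Weibull hazard with shape below one by its shifted version can be illustrated numerically. The sketch below uses the cumulative hazards $(t/\lambda)^\alpha$ and $((t+\varepsilon)/\lambda)^\alpha - (\varepsilon/\lambda)^\alpha$; for $\alpha < 1$ their difference is at most $(\varepsilon/\lambda)^\alpha$ by subadditivity of $t \mapsto t^\alpha$, which bounds the gap between the survival functions. The grid, the parameter values and the function names are illustrative assumptions.

```python
import math

def survival_weibull(t, alpha, lam):
    """S_0(t) = exp(-(t / lam)^alpha)."""
    return math.exp(-((t / lam) ** alpha))

def survival_shifted(t, alpha, lam, eps):
    """S_eps(t) for the hazard (alpha / lam) * ((t + eps) / lam)^(alpha - 1)."""
    cum_hazard = ((t + eps) / lam) ** alpha - (eps / lam) ** alpha
    return math.exp(-cum_hazard)

def sup_gap(alpha, lam, eps, t_max=50.0, n_grid=20_000):
    """Grid approximation (from below) of sup_t |S_0(t) - S_eps(t)| on [0, t_max]."""
    return max(
        abs(survival_weibull(t, alpha, lam) - survival_shifted(t, alpha, lam, eps))
        for t in (t_max * i / n_grid for i in range(n_grid + 1))
    )

if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, sup_gap(0.5, 2.0, eps), (eps / 2.0) ** 0.5)
```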

Future work will also focus on achieving consistency with respect to stronger topologies; there are two possible routes in this direction. The first one is to investigate under which additional conditions on the CRM $\tilde\mu$ and restrictions on the form of the true hazard rate $h_0$ we get $L_1$-consistency at the density level, that is, (10) with $A_\varepsilon$ being an $L_1$ neighborhood of $f_0$. To this end, one has then to consider the metric entropy of the subset of $\mathbb{F}$ corresponding to the qualitative condition given on $h_0$. Moreover, one has to investigate in detail the support of the prior $\Pi$ on $\mathbb{F}$ via the mapping $h \mapsto f = h\exp(-\int_0^{\cdot} h)$. This appears to be a rather difficult problem because of $h$ appearing twice, and existing results on random mixing densities are not easily extensible. The second strategy consists of investigating consistency directly at the hazard level. Indeed, weak consistency at the density level implies pointwise consistency of the cumulative hazard:



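$\Pi_n\left\{h : \left|\int_0^T h(t)\,dt - \int_0^T h_0(t)\,dt\right| < \varepsilon\right\} \to 1$ a.s.-$P_0^\infty$

for any $\varepsilon, T > 0$. Among stronger topologies, a promising one seems to be the one induced by $\int_0^\infty |h(t) - h_0(t)|S_0(t)\,dt$, where $S_0(t) = \exp\{-\int_0^t h_0(s)\,ds\}$.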

With reference to the study of the asymptotic behavior of functionals of 
the random hazard, a further interesting development consists in studying 
the joint limit as both the number of observations and time diverge. To 
achieve such a result, one probably needs to find a right balance in the 
simultaneous divergence of the sample size and time, which lets the influence 
of the data emerge. 



APPENDIX: BACKGROUND, ANCILLARY RESULTS AND PROOFS 

A.1. Completely random measures. Here we highlight some basic facts on CRMs. The reader is referred to [3] and [22] for exhaustive accounts. Consider a measure space $(\mathbb{X},\mathscr{X})$, where $\mathbb{X}$ is a complete and separable metric space and $\mathscr{X}$ is the usual Borel $\sigma$-field. Introduce a Poisson random measure $N$, defined on some probability space $(\Omega,\mathscr{F},P)$ and taking values in the set of nonnegative counting measures on $(\mathbb{R}^+\times\mathbb{X}, \mathscr{B}(\mathbb{R}^+)\otimes\mathscr{X})$, with intensity measure $\nu$, that is, $E[N(dv,dx)] = \nu(dv,dx)$ and, for any $A \in \mathscr{B}(\mathbb{R}^+)\otimes\mathscr{X}$ such that $\nu(A) < \infty$, $N(A)$ is a Poisson random variable of parameter $\nu(A)$. Given any finite collection of pairwise disjoint sets, $A_1,\ldots,A_k$, in $\mathscr{B}(\mathbb{R}^+)\otimes\mathscr{X}$, the random variables $N(A_1),\ldots,N(A_k)$ are mutually independent. Moreover, the intensity measure $\nu$ must satisfy $\int_{\mathbb{R}^+}(v\wedge 1)\,\nu(dv,\mathbb{X}) < \infty$ where $a\wedge b = \min\{a,b\}$.

Let now $(\mathbb{M},\mathscr{B}(\mathbb{M}))$ be the space of boundedly finite measures on $(\mathbb{X},\mathscr{X})$, where $\mu$ is said to be boundedly finite if $\mu(A) < +\infty$ for every bounded measurable set $A$. We suppose that $\mathbb{M}$ is equipped with the topology of vague convergence and that $\mathscr{B}(\mathbb{M})$ is the corresponding Borel $\sigma$-field. Let $\tilde\mu$ be a random element, defined on $(\Omega,\mathscr{F},P)$ and with values in $(\mathbb{M},\mathscr{B}(\mathbb{M}))$, and suppose that $\tilde\mu$ can be represented as a linear functional of the Poisson random measure $N$ (with intensity $\nu$) as $\tilde\mu(B) = \int_{\mathbb{R}^+\times B} s\,N(ds,dx)$ for any $B \in \mathscr{X}$. From the properties of $N$ it easily follows that $\tilde\mu$ is a CRM on $\mathbb{X}$ [21], that is: (i) $\tilde\mu(\emptyset) = 0$ a.s.-P; (ii) for any collection of disjoint sets in $\mathscr{X}$, $B_1,B_2,\ldots$, the random variables $\tilde\mu(B_1),\tilde\mu(B_2),\ldots$ are mutually independent and $\tilde\mu(\cup_{j\ge1}B_j) = \sum_{j\ge1}\tilde\mu(B_j)$ holds true a.s.-P.

Now let $\mathscr{G}_\nu$ be the space of functions $g:\mathbb{X}\to\mathbb{R}^+$ such that $\int_{\mathbb{R}^+\times\mathbb{X}}[1 - e^{-sg(x)}]\,\nu(ds,dx) < \infty$. Then, the law of $\tilde\mu$ is uniquely characterized by its Laplace functional which, for any $g$ in $\mathscr{G}_\nu$, is given by

(35) $E\left[e^{-\int_{\mathbb{X}} g(x)\,\tilde\mu(dx)}\right] = \exp\left\{-\int_{\mathbb{R}^+\times\mathbb{X}}[1 - e^{-sg(x)}]\,\nu(ds,dx)\right\}$.

From (35) it is apparent that the law of the CRM $\tilde\mu$ is completely determined by the corresponding intensity measure $\nu$. Letting $\lambda$ be a $\sigma$-finite measure on $\mathbb{X}$, we can always write the Poisson intensity $\nu$ as (6), where $\rho:\mathscr{B}(\mathbb{R}^+)\times\mathbb{X}\to\mathbb{R}^+$ is a kernel [i.e., $x\mapsto\rho(C|x)$ is $\mathscr{X}$-measurable for any $C\in\mathscr{B}(\mathbb{R}^+)$ and $\rho(\cdot|x)$ is a $\sigma$-finite measure on $\mathscr{B}(\mathbb{R}^+)$ for any $x$ in $\mathbb{X}$]. Note that the kernel $\rho(dv|x)$ is uniquely determined outside some set of $\lambda$-measure 0, and that such a disintegration is guaranteed by Theorem 15.3.3 in [17]. Finally, recall (see, e.g., Proposition 1 in [30]) that a linear functional of a CRM, $\int_{\mathbb{X}} f(x)\,\tilde\mu(dx)$, is a.s. finite if and only if

(36) $\int_{\mathbb{R}^+\times\mathbb{X}}[1 - e^{-u|f(x)|v}]\,\rho(dv|x)\lambda(dx) < +\infty$ $\forall u > 0$.

A.2. Proofs of the results of Section 3. 

Proof of Theorem 2. The first step consists in adapting the K-L condition (15) to the case of right-censoring. Denote by $\mathbb{F}_0 \subset \mathbb{F}\times\mathbb{F}$ the class of all pairs of density functions $(f_1,f_2)$ such that both $f_1$ and $f_2$ are supported on the entire positive real line. Let $X_i \sim f_i$, for $i = 1,2$, suppose $X_1$ is stochastically independent of $X_2$ and define $\psi(X_1,X_2) = (X_1\wedge X_2, \mathbb{I}_{(X_1\le X_2)})$. The density of $\psi$ with respect to the Lebesgue measure and the counting measure on $\{0,1\}$ is given by

$\phi(f_1,f_2)(z,1) = f_1(z)\int_z^\infty f_2(x)\,dx$, $\phi(f_1,f_2)(z,0) = \int_z^\infty f_1(x)\,dx\,f_2(z)$.

Then $\phi$ is one-to-one on $\mathbb{F}_0$ and the maps $\phi$, $\phi^{-1}$ defined on $\mathbb{F}_0$ and $\mathbb{F}_0^* = \phi(\mathbb{F}_0)$, respectively, are continuous with respect to the supremum distance on distribution functions. See Peterson [29]. Denote by $\tilde\Pi$ the prior on $\mathbb{F}_0$ and by $\Pi^* = \tilde\Pi\circ\phi^{-1}$ the induced prior on $\mathbb{F}_0^*$. Since $(f_0,f_c)\in\mathbb{F}_0$ by hypothesis, the continuity of $\phi^{-1}$ implies that the posterior $\tilde\Pi(\cdot|(Z_1,\Delta_1),\ldots,(Z_n,\Delta_n))$ is weakly consistent at $(f_0,f_c)$ if $\Pi^*(\cdot|(Z_1,\Delta_1),\ldots,(Z_n,\Delta_n))$ is weakly consistent at $\phi(f_0,f_c)$. Indicate by $p(x,d)$, for $x\in\mathbb{R}^+$ and $d = 0,1$, a generic element of $\mathbb{F}_0^*$. Then, the K-L support condition on $\Pi^*$ at $p_0\in\mathbb{F}_0^*$ takes the form

$\Pi^*\left\{p : \int_0^\infty p_0(x,1)\log\frac{p_0(x,1)}{p(x,1)}\,dx + \int_0^\infty p_0(x,0)\log\frac{p_0(x,0)}{p(x,0)}\,dx < \varepsilon\right\} > 0$

for any $\varepsilon > 0$. As observed in Section 2, since the prior on $f_c$ does not play any role in the analysis, we may treat $f_c$ as fixed, that is, take a prior on $\mathbb{F}\times\mathbb{F}$ of the form $\Pi\times\delta_{f_c}$. Hence, by setting $p_0(x,d) = \phi(f_0,f_c)(x,d)$, the K-L condition boils down to

(37) $\Pi\left\{f : \int_0^\infty f_0(t)S_c(t)\log\frac{f_0(t)}{f(t)}\,dt + \int_0^\infty S_0(t)f_c(t)\log\frac{S_0(t)}{S_f(t)}\,dt < \varepsilon\right\} > 0$

for any $\varepsilon > 0$, where we defined the survival functions $S_0(t) = 1 - \int_0^t f_0(x)\,dx$, $S_f(t) = 1 - \int_0^t f(x)\,dx$ and $S_c(t) = 1 - \int_0^t f_c(x)\,dx$.

The next step consists in showing that, under the stated hypotheses, the K-L support condition (37) is satisfied, which in turn implies weak consistency. Specifically, we show that a sufficient condition for (37) is that, for any $\delta > 0$, there exists $T'$ such that, for any $T > T'$,

(38) $\Pi\left\{h : \sup_{t\le T}|h(t) - h_0(t)| < \delta,\ \int_T^\infty |H - H_0|f_0 < \delta\right\} > 0$,

where $H(t) = \int_0^t h(s)\,ds$ and $H_0(t) = \int_0^t h_0(s)\,ds$. By the structural properties of the model with (2)-(5), it follows that (38) holds under condition (17) and $\int_0^\infty |H(t) - H_0(t)|f_0(t)\,dt < \infty$ a.s. In particular, the latter is implied by condition (i) and the fact that $\int_0^\infty H_0(t)f_0(t)\,dt < \infty$.

Define the set

(39) $V(\delta,T) := \left\{h : \sup_{t\le T}|h(t) - h_0(t)| < \delta,\ \int_T^\infty |H - H_0|f_0 < \delta\right\}$,

which, by (38), has positive probability for any $\delta$ and any $T$ larger than a time point $T'$ that may depend on $\delta$. Our goal is then to show that, for any $\varepsilon > 0$, there exist $\delta > 0$ and $T$ sufficiently large such that, for any $h \in V(\delta,T)$,

(40) $\int_T^\infty \log(f_0/f)f_0S_c + \log(S_0/S_f)f_cS_0 < \varepsilon/2$,

(41) $\int_0^T \log(f_0/f)f_0S_c + \log(S_0/S_f)f_cS_0 < \varepsilon/2$,

where $f(t) = h(t)\exp(-\int_0^t h(s)\,ds)$. Let us start from (40) by noting that

(42) $\int_T^\infty \log(f_0/f)f_0S_c + \log(S_0/S_f)f_cS_0 \le \left|\int_T^\infty \log(h_0)f_0S_c\right| + \left|\int_T^\infty \log(h)f_0S_c\right| + \int_T^\infty |H - H_0|(f_0S_c + f_cS_0)$.



As for the first integral on the right-hand side of (42), it is easy to see that $\int_T^\infty \log(h_0)f_0S_c$ goes to zero as $T \to \infty$. As for the second integral, one needs to consider the case of $h(t)$ that eventually goes to zero, but then the negligibility of the integral as $T \to \infty$ is guaranteed by condition (i) and (8), which is needed for the model to be well defined. As for the third integral on the right-hand side of (42), notice that $f_0(t)S_c(t) + f_c(t)S_0(t) \le 2f_0(t)$ for $t$ sufficiently large since $S_c \le 1$ and $f_c$ is eventually smaller than $h_0$. Therefore $\int_T^\infty |H - H_0|(f_0S_c + f_cS_0) < 2\delta$ and we can conclude that there exist a positive $\delta$ sufficiently smaller than $\varepsilon/4$ and $T$ sufficiently large such that (40) holds for any $h \in V(\delta,T)$.

We are now left to show that (41) holds. Assume first that $h_0(0) > 0$ and write

(43) $\int_0^T \log(f_0/f)f_0S_c + \log(S_0/S_f)f_cS_0 = \int_0^T \log\frac{h_0(t)}{h(t)}\,f_0(t)S_c(t)\,dt + \int_0^T\int_0^t [h(s) - h_0(s)]\,ds\,[f_0(t)S_c(t) + f_c(t)S_0(t)]\,dt := I_1 + I_2$.

Next, let $c := \inf_{t\le T}h_0(t)$, which is positive by condition (i), and note that, for $\delta < c$ and $h \in V(\delta,T)$,

$I_1 \le \int_0^T \left|\frac{h_0(t)}{h(t)} - 1\right|f_0(t)S_c(t)\,dt \le \int_0^T \frac{\delta}{c-\delta}\,f_0(t)\,dt \le \frac{\delta}{c-\delta}$,

$I_2 \le \int_0^T \sup_{s\le t}|h(s) - h_0(s)|\,t\,[f_0(t)S_c(t) + f_c(t)S_0(t)]\,dt \le \delta\int_0^T t\,[f_0(t)S_c(t) + f_c(t)S_0(t)]\,dt \le \delta E_0$,

where $E_0 := \int_0^\infty t f_0(t)\,dt$ is finite by condition (i) and the last inequality follows from $f_0S_c + f_cS_0$ being the density of $Z = Y\wedge C$ which, in turn, is stochastically smaller than $Y$. Hence, $I_1 + I_2 \le \delta(c-\delta)^{-1} + \delta E_0$, so that $\delta < \min\{c\varepsilon/(4+\varepsilon), \varepsilon/(4E_0)\}$ implies (41) for any $h \in V(\delta,T)$, no matter how large $T$ is. Finally, one can choose $\delta$ small enough and $T$ large enough such that (40) and (41) are simultaneously satisfied for any $h \in V(\delta,T)$.
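The choice of $\delta$ in the last step is elementary arithmetic: $\delta/(c-\delta) < \varepsilon/4$ is equivalent to $\delta < c\varepsilon/(4+\varepsilon)$, and $\delta E_0 < \varepsilon/4$ to $\delta < \varepsilon/(4E_0)$. A minimal sketch that checks this for a few hypothetical values of $c$, $\varepsilon$ and $E_0$ (the function names and the test values are illustrative, not taken from the paper):

```python
def delta_threshold(c, eps, E0):
    """Upper limit min{c*eps/(4 + eps), eps/(4*E0)} for delta."""
    return min(c * eps / (4.0 + eps), eps / (4.0 * E0))

def bound_I1_plus_I2(delta, c, E0):
    """Upper bound delta/(c - delta) + delta*E0 on I_1 + I_2, valid for 0 < delta < c."""
    if not 0.0 < delta < c:
        raise ValueError("need 0 < delta < c")
    return delta / (c - delta) + delta * E0

if __name__ == "__main__":
    for c, eps, E0 in [(0.5, 0.1, 3.0), (2.0, 1.0, 0.2), (0.01, 0.5, 40.0)]:
        delta = 0.999 * delta_threshold(c, eps, E0)
        print(c, eps, E0, bound_I1_plus_I2(delta, c, E0), eps / 2.0)
```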
By allowing $h_0(0) = 0$, we need a different bound for $I_1$ in (43). We proceed by taking $0 < \varsigma < T$ and split $I_1$ into

$I_1 = \int_0^\varsigma \log\frac{h_0(t)}{h(t)}\,f_0(t)S_c(t)\,dt + \int_\varsigma^T \log\frac{h_0(t)}{h(t)}\,f_0(t)S_c(t)\,dt := I_{11} + I_{12}$.

As for $I_{12}$, for fixed $\varepsilon$, find $\delta$ and $T$ such that $h \in V(\delta,T)$ implies $I_{12} + I_2 < \varepsilon/4$, for any $\varsigma$. As for $I_{11}$, we need to prove that, for the same $\varepsilon$ fixed above, there exists a small enough $\varsigma > 0$ such that

(44) $\int_0^\varsigma \log(h_0(t)/h(t))f_0(t)S_c(t)\,dt < \varepsilon/4$.

This is tantamount to showing that $\log(h_0/h)f_0S_c$ is integrable at 0 a.s., which in turn reduces to showing that $\log(h_0/h)f_0$ is integrable at 0 a.s. since $S_c(0) = 1$. Note that it is sufficient to control the worst case, namely when $h(0) = 0$ a.s., but then integrability at 0 follows from condition (ii). Indeed, we need to show that there exists $0 < p < 1$ such that

TiO ^ 1 

First note that $\lim_{\tau\downarrow0}\log\{h_0(\tau)\}f_0(\tau) = 0$. This can be deduced by reasoning in terms of $\log(f_0)f_0$ since, clearly, $h_0(\tau) \sim f_0(\tau)$ as $\tau \to 0$. As for $\log(f_0)f_0$ vanishing at zero, we start by considering $f_0$ having regular variation of exponent $0 < p < 1$ at zero, that is, $f_0(\tau) \sim \tau^p L(1/\tau)$ as $\tau \to 0$, for $L(\cdot)$ a slowly varying function at $\infty$. Recall that a positive function $L(x)$ defined on $\mathbb{R}^+$ varies slowly at $\infty$ if, for every fixed $x$, $L(tx)/L(t) \to 1$ as $t \to \infty$. Hence,

$f_0(\tau)\log[f_0(\tau)] \sim \tau^p L(1/\tau)\{\log(\tau^p) + \log[L(1/\tau)]\} := \tau^p L^*(1/\tau)$,

where $L^*$ is a slowly varying function at $\infty$. Hence $f_0\log(f_0)$ is a regularly varying function at zero with exponent $p$ and, in turn, it vanishes at zero. Note that the larger $p$ is, the faster $\log\{h_0(\tau)\}f_0(\tau)$ vanishes as $\tau \to 0$. Next, we have that, for any $0 < p < 1$,

v log{ho(r)/h(T)}f (T) -log{Mr)} ... -log{r r } 
hm sup z = + hm sup = < hm = , 

rlO TP- 1 ri0 F TP' 1 -TiO TP' 1 

where the last limit is zero for any $0 < p < 1$. The integrability then follows for any $0 < p < 1$. Slightly different arguments can be used when $f_0$ has regular variation of exponent $p > 1$ at zero, while the special case of $f_0$ slowly varying at zero (i.e., $p = 0$) can be dealt with by using Lemma 2 of Feller [7], Section VII.8. The proof is then complete.

Proof of Proposition 3. The fact that (ii1) and (ii2) are sufficient for condition (ii)(b) of Theorem 2 to hold is straightforward.

Since for DL mixture hazards $h(t) = \tilde\mu([0,t])$ and for OU mixtures $h(t) \ge \sqrt{2\kappa}\,e^{-\kappa\varepsilon}\tilde\mu([0,t])$ for any $\varepsilon > t$, condition (ii1) is met for both.

Let us now show that CRMs as in (7) with $\sigma \in (0,1)$ and $\lambda(dx) = dx$ satisfy condition (ii2). Assume, for the moment, that $\gamma(\cdot)$ in (7) is constant and denote it by $\gamma$. Hence, we have the generalized gamma subordinator, whose Laplace exponent is given by $\psi(u) := \sigma^{-1}[(u+\gamma)^\sigma - \gamma^\sigma]$. Moreover, $\int_0^\infty v^\varepsilon\rho(dv) = \infty$ for any $\varepsilon < \sigma$ and the inverse of $\psi(u)$ is of the form $\psi^{-1}(y) = (\sigma y + \gamma^\sigma)^{1/\sigma} - \gamma$. Thus, we are in a position to apply Proposition 47.18 in [32], which, in our case, allows us to state that there exists a constant $C$ such that

(45) $\liminf_{t\downarrow0}\frac{\tilde\mu((0,t])}{g(t)} = C$ a.s. with $0 < C < \infty$,

where $g(t) = \log\log(1/t)\,[(\sigma t^{-1}\log\log(1/t) + \gamma^\sigma)^{1/\sigma} - \gamma]^{-1}$. From (45) it follows immediately that, for any $\delta > 0$, $\liminf_{t\downarrow0}\tilde\mu((0,t])/t^{1/\sigma+\delta} = \infty$ a.s. Hence, condition (ii2) is satisfied by taking $r = 1/\sigma + \delta$. To see that condition (ii2) holds also if $\tilde\mu$ is a nonhomogeneous CRM it is enough to note that the corresponding Laplace exponent $\sigma^{-1}\int_0^1[(u+\gamma(x))^\sigma - \gamma(x)^\sigma]\,dx$ is bounded above by $\psi(u) := \sigma^{-1}[(u+\gamma)^\sigma - \gamma^\sigma]$ with $\gamma = \inf_{x\in\mathbb{R}^+}\gamma(x) \ge 0$ and that, infinitesimally, a nonhomogeneous CRM behaves like a homogeneous one.
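The Laplace exponent of the generalized gamma subordinator and its inverse can be checked numerically. The sketch below assumes the jump intensity $\rho(ds) = s^{-1-\sigma}e^{-\gamma s}\,ds/\Gamma(1-\sigma)$ (an assumption about the form of (7)), evaluates $\int_0^\infty(1-e^{-us})\rho(ds)$ by quadrature, and compares it with $\psi(u) = \sigma^{-1}[(u+\gamma)^\sigma - \gamma^\sigma]$; function names and grid settings are illustrative.

```python
import math

def psi(u, sigma, gamma):
    """Laplace exponent ((u + gamma)^sigma - gamma^sigma) / sigma."""
    return ((u + gamma) ** sigma - gamma ** sigma) / sigma

def psi_inverse(y, sigma, gamma):
    """Inverse of psi: (sigma * y + gamma^sigma)^(1 / sigma) - gamma."""
    return (sigma * y + gamma ** sigma) ** (1.0 / sigma) - gamma

def psi_numeric(u, sigma, gamma, lo=-80.0, hi=8.0, step=1e-3):
    """Trapezoidal rule for int_0^inf (1 - exp(-u s)) rho(ds) with s = exp(y)."""
    n = int(round((hi - lo) / step))
    total = 0.0
    for i in range(n + 1):
        y = lo + i * step
        s = math.exp(y)
        # (1 - exp(-u s)) * s^(-1 - sigma) * exp(-gamma s) * (ds/dy = s)
        val = -math.expm1(-u * s) * math.exp(-sigma * y - gamma * s)
        total += val * (0.5 if i in (0, n) else 1.0)
    return total * step / math.gamma(1.0 - sigma)

if __name__ == "__main__":
    print(psi(2.0, 0.5, 1.0), psi_numeric(2.0, 0.5, 1.0))
    print(psi_inverse(psi(2.0, 0.5, 1.0), 0.5, 1.0))
```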

An auxiliary lemma. Before getting into the proofs of the consistency results, we provide a useful auxiliary result. Let $\mathbb{M}$ be the space of boundedly finite measures on $\mathbb{R}^+$ and denote by $\mathbb{G}$ the space of distribution functions associated to it: clearly, any $G \in \mathbb{G}$ will be a nondecreasing càdlàg function on $\mathbb{R}^+$ such that $G(0) = 0$.

Lemma 11. Let $\tilde\mu$ be a CRM on $\mathbb{R}^+$, satisfying (H1), and denote by $Q$ the distribution induced on $\mathbb{G}$. Then, for any $G_0 \in \mathbb{G}$, any finite $M$ and $\eta > 0$,

$Q\left\{G \in \mathbb{G} : \sup_{x\le M}|G(x) - G_0(x)| < \eta\right\} > 0$.

Proof. Fix $\varepsilon > 0$ and choose $(z_0,\ldots,z_N)$ such that (i) $0 = z_0 \le z_1 < \cdots < z_N = M$; (ii) all locations, where $G_0$ has a jump of size larger than $\varepsilon/2$, are contained in $(z_1,\ldots,z_{N-1})$; (iii) for $l = 1,\ldots,N$, $G_0(z_l^-) - G_0(z_{l-1}) < \varepsilon$. Next, define

(46) $G_\varepsilon(x) = \sum_{l=1}^{N} j_l\,1_{(z_l\le x)}$,

where the jump $j_l$ at $z_l$ is given by $j_l = G_0\{z_l\} + G_0(z_l^-) - G_0(z_{l-1})$, for $l = 1,\ldots,N$. If $z_1 = 0$, then set by convention $G_0(z_0) := G_0(0^-) = 0$ and $G_\varepsilon(z_0) := G_\varepsilon(0^-) = 0$. By construction $G_\varepsilon(x) \le G_0(x)$ for any $x \le M$ and $\sup_{x\le M}|G_0(x) - G_\varepsilon(x)| < \varepsilon$. Under (H1), it can be proved that, for any $x \le M$ and for any $a, b$ such that $0 < a < b$, $Q\{G \in \mathbb{G} : G(x) \in (a,b)\} > 0$. See, for example, Proposition 1 in [5]. Given this, we next show that $G_\varepsilon$ in (46) is in the support of $Q$. Fix $\delta > 0$ and denote by $B_\delta(c)$ the ball of radius $\delta$ centered in $c$. Define $W_l(G_\varepsilon) = \{G \in \mathbb{G} : G(z_l) - G(z_{l-1}) \in B_{\delta/(2N)}(G_\varepsilon(z_l) - G_\varepsilon(z_{l-1}))\}$ for $l = 1,\ldots,N$, with the convention that $G(z_0) := G(0^-) = 0$ if $z_1 = 0$ so that $G(z_1) - G(z_0) = G\{0\}$. Then $\bigcap_{l=1}^{N} W_l(G_\varepsilon) \subset \{G \in \mathbb{G} : \sup_{x\le M}|G(x) - G_\varepsilon(x)| < \delta\}$. The sets $W_l(G_\varepsilon)$ are independent under $Q$ and each has positive probability. We conclude that, for any $\delta > 0$, $Q\{G \in \mathbb{G} : \sup_{x\le M}|G(x) - G_\varepsilon(x)| < \delta\} \ge Q\{\bigcap_{l=1}^{N} W_l\} > 0$. The proof is then completed by taking $\varepsilon$ and $\delta$ such that $\varepsilon + \delta < \eta$. □

Now, relying on Theorem 2 and Lemma 11, we are in a position to provide the proofs of Theorems 4-7. Showing that (17) is met for the specific kernels at issue represents a result of independent interest concerning small ball probabilities of mixtures with respect to CRMs; indeed, passing through Lemma 11, we actually show that (H1) is sufficient for (17), that is, for $h$ putting positive probability on uniform neighborhoods of $h_0$.

Proof of Theorem 4. The first step consists in verifying consistency with respect to hazards of mixture form. To this end we postulate the existence of a boundedly finite measure $\mu_0$ on $\mathbb{R}^+$ such that

(47) $h_0(t) = \int_{\mathbb{R}^+} k(t,x)\,\mu_0(dx)$.

Clearly, $\mu_0$ has to be such that $\int_0^T h_0(t)\,dt \to +\infty$, as $T \to \infty$, in order to ensure the model to be properly defined. In the case of the DL kernel, (17) is a direct consequence of Lemma 11 since $h_0(t) = G_0(t)$ and $h(t) = \tilde\mu([0,t])$.

The consistency result clearly extends to all increasing hazard rates $h_0$ with $h_0(0) = 0$. To see this let $\mu_0$ be the measure associated to $h_0$. Then $\mu_0 \in \mathbb{M}$ since $\mu_0((0,\tau]) = h_0(\tau) \to 0$ as $\tau \to 0$ and $h_0(t) = \int_{\mathbb{R}^+} 1_{(0\le x\le t)}\,\mu_0(dx)$. Finally, note that the moment condition in (i) of Theorem 2 reduces to $\int_{\mathbb{R}^+} E[H(t)]f_0(t)\,dt < \infty$ since, for any choice of $\lambda$ in (6) and for any large enough $t$, $E[H(t)] > t$.

Proof of Theorem 5. As before, we first establish (17) for $h_0$ of mixture
form (47) and assume $\tau$ to be fixed and (i) and (ii) to hold. Take
$G \in \{G \in \mathbb{G} : \sup_{x \le T+\tau} |G(x) - G_0(x)| < \delta\}$ and let $h_G$ be the corresponding hazard
rate. Then, one has

$$\begin{aligned}
\sup_{t \le T} |h_G(t) - h_0(t)|
 &= \sup_{t \le T} \left| \int_{(t-\tau)^+}^{t+\tau} dG(x) - \int_{(t-\tau)^+}^{t+\tau} dG_0(x) \right| \\
 &= \sup_{t \le T} |G(t+\tau) - G((t-\tau)^+) - G_0(t+\tau) + G_0((t-\tau)^+)| \\
 &\le \sup_{t \le T} |G(t+\tau) - G_0(t+\tau)| + \sup_{t \le T} |G((t-\tau)^+) - G_0((t-\tau)^+)| < 2\delta.
\end{aligned}$$

Take $\delta$ such that $2\delta < \eta$, which yields
$\Pi\{h : \sup_{0 < t \le T} |h(t) - h_0(t)| < \eta\} \ge Q\{G \in \mathbb{G} : \sup_{x \le T+\tau} |G(x) - G_0(x)| < \delta\}$,
where we recall that $Q$ is the distribution induced on $\mathbb{G}$. The right-hand side
has positive probability by Lemma 11 and, hence, (17) holds.
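
The $2\delta$ bound above can be checked numerically on a toy example. The sketch below is only an illustration: the distribution function `G0`, the perturbation `G`, and the values of `T`, `TAU` and `DELTA` are assumptions chosen for the check, not objects from the paper.

```python
import math

def rect_hazard(G, t, tau):
    """Rectangular-kernel mixture hazard: h_G(t) = G(t + tau) - G((t - tau)^+)."""
    return G(t + tau) - G(max(t - tau, 0.0))

def G0(x):
    # Hypothetical "true" mixing distribution function (illustrative assumption).
    return x + x * x

DELTA = 0.1

def G(x):
    # Nondecreasing perturbation of G0 with sup|G - G0| <= 0.99 * DELTA < DELTA.
    return G0(x) + 0.99 * DELTA * math.sin(7.0 * x)

T, TAU = 3.0, 0.5
grid = [i * T / 2000 for i in range(2001)]

# Largest gap between the two hazards over a grid on [0, T].
sup_hazard_gap = max(
    abs(rect_hazard(G, t, TAU) - rect_hazard(G0, t, TAU)) for t in grid
)
print(sup_hazard_gap, "<", 2 * DELTA)
```

The check uses only the triangle inequality from the display, so any other pair of distribution functions within uniform distance $\delta$ on $[0, T+\tau]$ would serve equally well.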

The next step consists in showing that any $h_0$, which is bounded Lipschitz
continuous of constant $K > 0$ and satisfies (i) and (ii), can be approximated
in the sup norm on $[0,T]$ to any order of accuracy by a rectangular
mixture hazard (47) with a sufficiently small bandwidth $\tau$. To this end, define
$h_m(t) = \int I_{(|t-x| \le \tau_m)}\,dG_m(x)$ with $\tau_m = m^{-\eta}$ ($m = 1, 2, \ldots$ and $\eta > 0$)
and $dG_m(x) = I_{(x \le m)}(2\tau_m)^{-1} h_0(x)\,dx$. Note that $G_m \in \mathbb{G}$ for any integer $m$.
Hence, we have

$$h_m(t) = \frac{1}{2\tau_m}\bigl(H_0(m \wedge (t+\tau_m)) - H_0((t-\tau_m)^+)\bigr) I_{(t \le m+\tau_m)}$$

and $h_m(t) \to h_0(t)$ for any $t$ as $m \to \infty$.

Next we apply the Arzelà-Ascoli theorem in order to obtain uniform
convergence on a compact $[0,T]$. Hence we need to show that: (a) the sequence
$\{h_m\}_{m \ge 1}$ is bounded on $[0,T]$ uniformly in $m$; (b) $\{h_m\}_{m \ge 1}$ is an
equicontinuous sequence of functions on $[0,T]$. See Theorem 3 on page 270 in Feller
[7]. Condition (a) is implied by $H_0$ being Lipschitz, which is guaranteed by
the boundedness of $h_0$. Condition (b) boils down to showing that, to each
$\varepsilon > 0$, there corresponds a $\delta > 0$ such that

$$|t - s| < \delta \;\Rightarrow\; |h_m(t) - h_m(s)| < \varepsilon \tag{48}$$

for all large $m$. For simplicity we consider $\tau_m < s < t < m - \tau_m$. Then

$$|h_m(t) - h_m(s)| = \left| \frac{H_0(t+\tau_m) - H_0(t-\tau_m)}{2\tau_m} - \frac{H_0(s+\tau_m) - H_0(s-\tau_m)}{2\tau_m} \right| = |h_0(t^*) - h_0(s^*)|$$

for some $t^*, s^*$ such that $t - \tau_m < t^* < t + \tau_m$ and $s - \tau_m < s^* < s + \tau_m$. Next,
for $t^*, s^*$ running in these two intervals

$$|h_0(t^*) - h_0(s^*)| \le \sup_{t^*, s^*} |h_0(t^*) - h_0(s^*)| \le K \sup_{t^*, s^*} |t^* - s^*| = K|t + \tau_m - (s - \tau_m)| < K(\delta + 2\tau_m).$$

Finally, for given $\varepsilon$, choose $m_0$ large enough such that $T < m_0 - \tau_{m_0}$ and
$\varepsilon/K - 2\tau_{m_0} > 0$. Then (48) is satisfied for $\delta < \varepsilon/K - 2\tau_{m_0}$, and (b) is proved.
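
The uniform approximation by rectangular mixtures can be illustrated numerically. The following is a minimal sketch under stated assumptions: the hazard $h_0(t) = t/(1+t)$ (bounded by 1, Lipschitz with constant 1, and vanishing at zero), the exponent `eta = 1` in $\tau_m = m^{-\eta}$, and the grid on $[0, 5]$ are all illustrative choices rather than quantities taken from the paper.

```python
import math

def h0(t):
    # Hypothetical true hazard: bounded, Lipschitz, with h0(0) = 0.
    return t / (1.0 + t)

def H0(t):
    # Cumulative hazard of h0 in closed form: t - log(1 + t).
    return t - math.log1p(t)

def h_m(t, m, eta=1.0):
    """Rectangular mixture hazard with bandwidth tau_m = m ** (-eta)."""
    tau = m ** (-eta)
    if t > m + tau:
        return 0.0
    upper = min(m, t + tau)        # the mixing measure G_m lives on [0, m]
    lower = max(t - tau, 0.0)      # (t - tau)^+
    return (H0(upper) - H0(lower)) / (2.0 * tau)

def sup_error(m, T=5.0, n_grid=5000):
    grid = [i * T / n_grid for i in range(n_grid + 1)]
    return max(abs(h_m(t, m) - h0(t)) for t in grid)

errors = {m: sup_error(m) for m in (10, 100, 1000)}
for m, err in errors.items():
    print(m, err)
```

For this particular $h_0$ the sup-norm error on the grid stays below the bandwidth $\tau_m$, so it shrinks as $m$ grows, in line with the Arzelà-Ascoli argument.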

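Now fix $\eta > 0$. There exists $m_0$ such that $\sup_{0 < t \le T} |h_{m_0}(t) - h_0(t)| < \eta/2$.
For this $m_0$, take $G \in \{G \in \mathbb{G} : \sup_{x \le T+\tau_{m_0}} |G(x) - G_{m_0}(x)| < \delta\}$ for $\delta < \eta/4$
and let $h_G$ be the corresponding hazard rate. Then, one has

$$\begin{aligned}
\sup_{t \le T} |h_G(t) - h_{m_0}(t)|
 &= \sup_{t \le T} \left| \int_{(t-\tau)^+}^{t+\tau} dG(x) - \int_{(t-\tau)^+}^{t+\tau} dG_{m_0}(x) \right| \\
 &= \sup_{t \le T} |G(t+\tau) - G((t-\tau)^+) - G_{m_0}(t+\tau) + G_{m_0}((t-\tau)^+)| \\
 &\le \sup_{t \le T} |G(t+\tau) - G_{m_0}(t+\tau)| + \sup_{t \le T} |G((t-\tau)^+) - G_{m_0}((t-\tau)^+)| < \eta/2.
\end{aligned}$$

Such $m_0$ and $\delta$ yield
$\Pi\{h : \sup_{0 < t \le T} |h(t) - h_0(t)| < \eta\} \ge Q\{G \in \mathbb{G} : \sup_{x \le T+\tau} |G(x) - G_0(x)| < \delta\} \times \pi\{\tau \in (0, \tau_{m_0})\}$.
The right-hand side has positive probability by Lemma 11 and the hypotheses
on $\pi$. Hence, the proof is complete.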

Proof of Theorem 6. As for the Ornstein-Uhlenbeck kernel, note that,
since $\tilde\mu$ is a.s. discrete, $\tilde h$ is a shot-noise process with exponentially
decaying shocks, that is, $\tilde h(t) = \sum_i J_i \sqrt{2\kappa} \exp\{-\kappa(t - X_i)\} I_{(0 \le X_i \le t)}$, where the
$J_i$'s and $X_i$'s are the random shocks and locations, respectively. We first
aim at showing that any $h_0$ of the form (47) satisfying (i) and (ii) can
be approximated in the sup norm on $[0,T]$ to any order of accuracy by a
step-wise continuous function with a finite number of jumps. Let $G_\varepsilon$ be
the step function defined in (46) and $h_\varepsilon$ the corresponding hazard rate
$h_\varepsilon(t) = \sum_{i=1}^{N} j_i \sqrt{2\kappa} \exp\{-\kappa(t - z_i)\} I_{(0 \le z_i \le t)}$. We first prove that

$$\sup_{t \le T} |h_0(t) - h_\varepsilon(t)| \le \varepsilon\sqrt{2\kappa} \tag{49}$$

by determining a lower and an upper bound for the difference $h_0 - h_\varepsilon$. It
turns out that the minimum distance $h_0 - h_\varepsilon$ is attained at one of the jump
points $z_l$ and a lower bound for $h_0(z_l)$ is obtained by moving the increment
$G_0(z_i^-) - G_0(z_{i-1})$ near to the right of $z_{i-1}$ for any $i \le l$. Setting $\Delta_i :=
\sqrt{2\kappa}[G_0(z_i^-) - G_0(z_{i-1})]$, we have

i i 
h e { Zl )-h { Zl ) < KizO-^Goiz^V^e-^-^-^^e-^'-^ 

i=l i=l 
$$\le \varepsilon\sqrt{2\kappa} \sum_{i=1}^{l} \bigl[e^{-\kappa(z_l - z_i)} - e^{-\kappa(z_l - z_{i-1})}\bigr] \le \varepsilon\sqrt{2\kappa}.$$

As for the maximum of $h_0 - h_\varepsilon$, an upper bound for $h_0(t)$, with $t \in [z_l, z_{l+1})$,
is obtained by moving the increment $G_0(z_i^-) - G_0(z_{i-1})$ near to the left of
$z_i$ for $i \le l$ and $G_0(t) - G_0(z_l)$ near to the left of $t$. Hence, we get

$$h_0(t) - h_\varepsilon(t) \le h_\varepsilon(z_l) + [G_0(t) - G_0(z_l)]\sqrt{2\kappa} - h_\varepsilon(t) = [G_0(t) - G_0(z_l)]\sqrt{2\kappa}$$

and (49) is proved. Now, take $G$ such that $\sup_{x \le T} |G_\varepsilon(x) - G(x)| < \delta$ and
denote by $h_G(t) = \int_0^t \sqrt{2\kappa} \exp\{-\kappa(t - x)\}\,G(dx)$. We show that

$$\sup_{t \le T} |h_\varepsilon(t) - h_G(t)| \le 2\delta\sqrt{2\kappa}. \tag{50}$$

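Reasoning as for (49), the following bounds for $h_G(t)$ can be found

$$h_G(t) \le \sqrt{2\kappa}\bigl(2\delta - \delta e^{-\kappa(t - z_l)^+}\bigr) + \sum_{i=1}^{N} j_i \sqrt{2\kappa}\, e^{-\kappa(t - z_i)} I_{(z_i \le t)},$$

$$h_G(t) \ge \sqrt{2\kappa}\bigl(\delta e^{-\kappa t} - 2\delta e^{-\kappa(t - z_l)}\bigr) I_{(z_l \le t)} + \sum_{i=1}^{N} j_i \sqrt{2\kappa}\, e^{-\kappa(t - z_i)} I_{(z_i \le t)}$$

with $t \in [z_l, z_{l+1})$, $l = 0, \ldots, N-1$, $a^+ = a \vee 0$ and $\sum_{i=1}^{0} = 0$. Hence,

$$\sqrt{2\kappa}\bigl(\delta e^{-\kappa t} - 2\delta e^{-\kappa(t - z_l)}\bigr) I_{(z_l \le t)} \le h_\varepsilon(t) - h_G(t) \le \sqrt{2\kappa}\bigl(2\delta - \delta e^{-\kappa(t - z_l)^+}\bigr),$$

which leads to the following bound in the sup norm

$$\sup_{t \le T} |h_\varepsilon(t) - h_G(t)| \le \max\bigl\{\sqrt{2\kappa}(2\delta - \delta e^{-\kappa z_l}),\ \sqrt{2\kappa}(2\delta - \delta e^{-\kappa T})\bigr\} \le 2\delta\sqrt{2\kappa}.$$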

Thus, (50) is proved. Now, by combining (49) and (50), for any $G \in \mathbb{G}$ such
that $\sup_{x \le T} |G(x) - G_\varepsilon(x)| < \delta$, we have

$$\sup_{t \le T} |h_0(t) - h_G(t)| \le \sup_{t \le T} |h_0(t) - h_\varepsilon(t)| + \sup_{t \le T} |h_\varepsilon(t) - h_G(t)| \le (\varepsilon + 2\delta)\sqrt{2\kappa}.$$

Now, for any $\eta > 0$, take $\varepsilon$ and $\delta$ small enough such that $(\varepsilon + 2\delta)\sqrt{2\kappa} < \eta$.
Hence, we obtain
$\Pi\{h : \sup_{0 < t \le T} |h(t) - h_0(t)| < \eta\} \ge Q\{G \in \mathbb{G} : \sup_{x \le T} |G(x) - G_\varepsilon(x)| < \delta\}$.
Note that the right-hand side has positive probability by Lemma
11 and, hence, (17) is proved. Now we show that any differentiable hazard
rate such that $h_0(0) = 0$ and, according to condition (iii),

$$-h_0'(t)/h_0(t) < \kappa\sqrt{2\kappa},$$

can be represented as OU mixture (47) with absolutely continuous $\mu_0$. Define
$u(x) = h_0'(x) + \kappa\sqrt{2\kappa}\,h_0(x)$. Such $u$ is a well defined Radon-Nikodym
derivative of a boundedly finite measure with respect to the Lebesgue
measure. This follows from the fact that $u(x) > 0$ for any $x$ by condition (iii)
and that it is integrable in zero since $\int_0^\tau u(x)\,dx = h_0(\tau) + \kappa\sqrt{2\kappa}\,H_0(\tau)$.
Set $dG_0(x) = u(x)\,dx$, where $G_0$ is the d.f. associated to $\mu_0$, and define
$h^*(t) := \int_0^t \sqrt{2\kappa}\, e^{-\kappa(t - x)}\,dG_0(x)$. Then, both $h^*$ and $h_0$ are solutions of the
differential equation

$$dh(t) = -\kappa\sqrt{2\kappa}\,h(t)\,dt + dG_0(t) \quad \text{with } h(0) = 0.$$

Thus, they coincide and the proof is complete.

Proof of Theorem 7. Assume first $h_0$ to be an exponential mixture (47)
satisfying assumptions (i) and (ii). Then, $h_0$ is obviously strictly positive
on $\mathbb{R}^+$. As for the uniform bound of $|\tilde h(t) - h_0(t)|$, it is useful to write
$h_0(t) = \int_{\mathbb{R}^+} e^{-t/x}\,dG_0'(x)$ and $\tilde h(t) = \int_{\mathbb{R}^+} e^{-t/x}\,\tilde\mu'(dx)$, where

$$G_0'(x) = \int_0^x z^{-1}\,dG_0(z) \quad \text{and} \quad \tilde\mu'([0,x]) = \int_0^x z^{-1}\,\tilde\mu(dz).$$

Note that, by condition (ii) and the assumption that $\tilde h(0) < \infty$ a.s., $G_0'(x) < \infty$
and $\tilde\mu'([0,x]) < \infty$ a.s. for any finite $x$. Let $Q'$ denote the distribution
induced on $\mathbb{G}$ by $\tilde\mu'$. One can check that, if Lemma 11 holds for $G_0$ and $\tilde\mu$,
then, for any finite $M$ and $\eta > 0$,

$$Q'\Bigl\{G' \in \mathbb{G} : \sup_{x \le M} |G'(x) - G_0'(x)| < \eta\Bigr\} > 0. \tag{51}$$

We now derive a bound for $|\tilde h(t) - h_0(t)|$ by exploiting the uniform
equicontinuity of the family of functions $\{e^{-t/x}, t \le T\}$, as $x$ varies in the compact
set $[0, M]$ for any $T < \infty$ and $M < \infty$. In fact, given $\gamma > 0$, the Arzelà-Ascoli
theorem ensures the existence of finitely many points $t_1, \ldots, t_m$ such that,
for any $t \le T$, there is an index $i$ for which

$$\sup_{x \le M} |e^{-t/x} - e^{-t_i/x}| < \gamma. \tag{52}$$

Now, note that condition (ii) and the assumption that $\tilde h(0) < \infty$ a.s. imply
that: (i) for any $\varepsilon_1 > 0$, there exists $M_1 < \infty$ large enough such that
$\int_{M_1}^{\infty} dG_0'(x) < \varepsilon_1$; (ii) for any $\varepsilon_2 > 0$, there exists $M_2 < \infty$ large enough such that
$Q'\{G' : \int_{M_2}^{\infty} dG'(x) < \varepsilon_2\} > 0$. At this point, take $M = M_1 \vee M_2$ and note that
$\{G' : \int_{M_2}^{\infty} dG'(x) < \varepsilon_2\} \subseteq A(M, \varepsilon_2)$, where $A(M, \varepsilon_2) := \{G' : \int_M^{\infty} dG'(x) < \varepsilon_2\}$.
Finally, define

$$B(M, \varepsilon_3) := \Bigl\{G' \in \mathbb{G} : \sup_{x \le M} |G'(x) - G_0'(x)| < \varepsilon_3\Bigr\},$$
which is a set of positive probability for any $\varepsilon_3 > 0$ by (51), and note
that $Q'[A(M, \varepsilon_2) \cap B(M, \varepsilon_3)] > 0$ by the independence of $\tilde\mu((0,M])$ and
$\tilde\mu([M,\infty))$. Take now $G' \in A(M, \varepsilon_2) \cap B(M, \varepsilon_3)$ and let $h_{G'}(t) = \int_{\mathbb{R}^+} e^{-t/x}\,dG'(x)$.
Then, for an arbitrary $t \le T$, choose the appropriate $t_i$ such that (52) holds
and write

$$\begin{aligned}
|h_{G'}(t) - h_0(t)| \le{}& \int |e^{-t/x} - e^{-t_i/x}|\,dG'(x) + \int |e^{-t/x} - e^{-t_i/x}|\,dG_0'(x) \\
 &+ \left| \int e^{-t_i/x}\,dG'(x) - \int e^{-t_i/x}\,dG_0'(x) \right| =: I_1 + I_2 + I_3.
\end{aligned}$$

As for $I_1$, we have

$$\begin{aligned}
I_1 &\le \gamma G'(M) + \int_M^{\infty} |e^{-t/x} - e^{-t_i/x}|\,dG'(x) \\
 &\le \gamma G'(M) + 2\int_M^{\infty} dG'(x) \le \gamma[G_0'(M) + \varepsilon_3] + 2\varepsilon_2,
\end{aligned}$$

where we used the fact that $h_0$ and $\tilde h$ are decreasing in the second step.
Similar arguments lead to $I_2 \le \gamma G_0'(M) + 2\varepsilon_1$. Concerning $I_3$, write



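$$\begin{aligned}
I_3 &\le \left| \int_0^M e^{-t_i/x}\,dG'(x) - \int_0^M e^{-t_i/x}\,dG_0'(x) \right| + \left| \int_M^{\infty} e^{-t_i/x}\,dG'(x) - \int_M^{\infty} e^{-t_i/x}\,dG_0'(x) \right| \\
 &\le \left| \int_0^M e^{-t_i/x}\,G'(dx) - \int_0^M e^{-t_i/x}\,G_0'(dx) \right| + \varepsilon_2 + \varepsilon_1 \le \varepsilon_3 + \varepsilon_2 + \varepsilon_1,
\end{aligned}$$

where, in the last step, we have exploited the fact that $G'$ belongs also
to a weak neighborhood of $G_0'$ of radius $\varepsilon_3$, when one reasons in terms
of finite measures over $[0,M]$. Summing up, we have obtained
$|h_{G'}(t) - h_0(t)| \le 2\gamma G_0'(M) + \gamma\varepsilon_3 + \varepsilon_3 + 3(\varepsilon_2 + \varepsilon_1)$, where $G_0'(M)$ is a finite constant.
Hence, we are able to state that, for a given $\eta$, it is always possible to
choose $\gamma, \varepsilon_1, \varepsilon_2$ and $\varepsilon_3$ such that $|h_{G'}(t) - h_0(t)| < \eta$ for $G'$ in a set of
positive probability. To see this, set $\varepsilon_1$ and $\varepsilon_2$ such that $3(\varepsilon_1 + \varepsilon_2) < \eta/4$,
then determine $M = M_1 \vee M_2$; since, for such $M$, we have $G_0'(M) < \infty$, set
$\gamma$ such that $2\gamma G_0'(M) < \eta/4$; for such $\gamma$ set $\varepsilon_3$ such that $\varepsilon_3(\gamma + 1) < \eta/4$.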
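
The bookkeeping behind this last step can be written out explicitly. The display below only adds the three estimates for $I_1$, $I_2$ and $I_3$ obtained above and then inserts the stated choices of the constants (read here as $\gamma$, $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$); nothing beyond those estimates is assumed.

```latex
\begin{align*}
I_1 + I_2 + I_3
 &\le \bigl(\gamma[G_0'(M) + \varepsilon_3] + 2\varepsilon_2\bigr)
    + \bigl(\gamma G_0'(M) + 2\varepsilon_1\bigr)
    + \bigl(\varepsilon_3 + \varepsilon_2 + \varepsilon_1\bigr) \\
 &= 2\gamma G_0'(M) + (\gamma + 1)\varepsilon_3 + 3(\varepsilon_1 + \varepsilon_2) \\
 &< \tfrac{\eta}{4} + \tfrac{\eta}{4} + \tfrac{\eta}{4}
  = \tfrac{3\eta}{4} < \eta .
\end{align*}
```

The order of the choices matters: $M$ depends on $\varepsilon_1, \varepsilon_2$, then $\gamma$ depends on $M$ through $G_0'(M)$, and $\varepsilon_3$ is chosen last because its bound involves $\gamma$.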

The next step consists in establishing that any completely monotone
function $\varphi$ on $\mathbb{R}^+$ such that $\varphi(0) < \infty$ is of the form

$$\varphi(t) = \int_{\mathbb{R}^+} x^{-1} e^{-t/x}\,dG(x), \tag{53}$$

where $G \in \mathbb{G}$, that is, it is an exponential mixture with respect to a
boundedly finite measure. The starting point is the fundamental result of
Bernstein, which characterizes completely monotone functions as mixtures, the
mixing measures being probability measures. For our needs it is more
convenient to resort to the version of Bernstein's result as formulated in Theorem
1(a) on page 439 in Feller [7]: a function $\varphi$ on $(0, \infty)$ is completely monotone
if and only if it is of the form

$$\varphi(t) = \int_0^{\infty} e^{-tx}\,U(dx), \tag{54}$$

where $U \in \mathbb{M}$. Without loss of generality, we may assume that $\varphi(\tau) \sim L(1/\tau)$,
where $L$ is a slowly varying function at infinity. This clearly covers the case
of $\varphi$'s such that $\varphi(0) < \infty$, which is one of our assumptions. At this point
we resort to a suitable Tauberian theorem (see Theorem on page 445 in [7]),
which allows one to deduce the behavior at infinity of $U$ in (54) from the behavior
in zero of $\varphi$. Hence, we have $U(t) \sim L(t)$, as $t \to \infty$. Let now $T(x) = 1/x$
and denote with $U \circ T^{-1}$ the image measure of $U$ by $T$. We can write



$$\varphi(t) = \int_0^{\infty} e^{-t/y}\,(U \circ T^{-1})(dy)$$

and define $G(dy) = y\,(U \circ T^{-1})(dy)$. For simplicity, we assume that $U(x)$
has an ultimately monotone derivative, that is, $U(dx) = u(x)\,dx$ with $u(x)$
monotone in some interval $(x_0, \infty)$. Then



for sufficiently small $\tau$. We aim at showing that $G(\tau) \to 0$ as $\tau \to 0$. In
fact, $U(t) \sim L(t)$ implies that, for any $\varepsilon > 0$, $u(t) = o(t^{\varepsilon - 1} L(t))$ as $t \to \infty$.
Otherwise, if $u(t) \sim K t^{\varepsilon^* - 1} L(t)$ for some $\varepsilon^* > 0$ and constant $K$, then
$U(t) \sim (K/\varepsilon^*)\, t^{\varepsilon^*} L(t)$ (see the lemma after Theorem 4 on page 446 in [7]) which, in
turn, contradicts $U(t) \sim L(t)$. Next we have



where the integrand is monotone and it remains bounded as $\tau \to 0$. Thus,
$G(\tau) = o(\tau^{1-\varepsilon} L(1/\tau))$ for any $\varepsilon > 0$, and in particular for $\varepsilon < 1$, from which
the desired result follows. We have then established that any completely
monotone function $\varphi$ such that $\varphi(0) < \infty$ is of the form (53).
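
A worked instance may help fix ideas. The example below is an illustration only: it takes the completely monotone function $\varphi(t) = 1/(1+t)$, which is not discussed in the paper, and assumes that (53) reads $\varphi(t) = \int x^{-1} e^{-t/x}\,dG(x)$ with $G(dy) = y\,(U \circ T^{-1})(dy)$ and $T(x) = 1/x$ as above.

```latex
\begin{align*}
\varphi(t) &= \frac{1}{1+t} = \int_0^\infty e^{-tx}\, e^{-x}\,dx,
  \qquad\text{so that } U(dx) = e^{-x}\,dx, \\
(U \circ T^{-1})(dy) &= y^{-2} e^{-1/y}\,dy, \qquad
  G(dy) = y\,(U \circ T^{-1})(dy) = y^{-1} e^{-1/y}\,dy, \\
\int_0^\infty y^{-1} e^{-t/y}\,G(dy)
  &= \int_0^\infty y^{-2} e^{-(1+t)/y}\,dy
   = \int_0^\infty e^{-(1+t)w}\,dw = \frac{1}{1+t}, \\
G(\tau) &= \int_0^\tau y^{-1} e^{-1/y}\,dy \le \frac{\tau}{e} \to 0
  \quad\text{as } \tau \to 0 .
\end{align*}
```

The third line uses the substitution $w = 1/y$, and the last bound uses $\sup_{y>0} y^{-1} e^{-1/y} = 1/e$. Here $G$ is finite on every bounded interval although its total mass is infinite, which is all that bounded finiteness requires.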

Finally, the fact that the moment condition (ii) in Theorem 2 reduces
to $\int_{\mathbb{R}^+} t f_0(t)\,dt < \infty$ follows from the fact that the function $t \mapsto \tilde h(t)$ is a.s.
decreasing. Hence, the proof is complete.

Further results and proofs for Section 4.

Compensated Poisson random measures. In order to prove the results
concerning functionals of hazard rates, we will often work with the
compensated Poisson random measure canonically associated to a Poisson measure
$N$ with intensity $\nu$. This object is written $N^c = \{N^c(A) : A \in \mathcal{B}(\mathbb{R}^+) \otimes \mathcal{X}\}$
and is defined as the unique CRM on $(\mathbb{R}^+ \times \mathbb{X}, \mathcal{B}(\mathbb{R}^+) \otimes \mathcal{X})$ such that

$$N^c(A) = N(A) - \nu(A)$$

for every set $A$ of finite $\nu$-measure. For every $g \in L^2(\nu)$, we denote by

$$N^c(g) = \int_{\mathbb{R}^+ \times \mathbb{X}} g(s,x)\,N^c(ds,dx)$$

the Wiener-Itô integral of $g$ with respect to $N^c$. Observe that, for every
$g \in L^2(\nu)$, $N^c(g)$ is a centered and square integrable random variable with
an infinitely divisible law. In particular, for every $\lambda \in \mathbb{R}$,

$$\mathbb{E}[e^{i\lambda N^c(g)}] = \exp\left( \int_{\mathbb{R}^+ \times \mathbb{X}} [e^{i\lambda g(s,x)} - 1 - i\lambda g(s,x)]\,\nu(ds,dx) \right). \tag{55}$$

Also for every $f, g \in L^2(\nu)$, one has the fundamental isometric property

$$\mathbb{E}[N^c(f) N^c(g)] = \int_{\mathbb{R}^+ \times \mathbb{X}} f(s,x) g(s,x)\,\nu(ds,dx) := \langle f, g \rangle_{L^2(\nu)}. \tag{56}$$

Note that (35), (55) and (56) imply that, for every $g \in L^2(\nu) \cap L^1(\nu)$,

$$\mathbb{E}[N(g)] = \int_{\mathbb{R}^+ \times \mathbb{X}} g(s,x)\,\nu(ds,dx),$$

$$\mathrm{Var}[N(g)] = \mathrm{Var}[N^c(g)] = \int_{\mathbb{R}^+ \times \mathbb{X}} g(s,x)^2\,\nu(ds,dx).$$
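
The centering and the variance formula can be illustrated by simulation in a deliberately simplified setting. The sketch below is an assumption-laden toy: it replaces $\mathbb{R}^+ \times \mathbb{X}$ by the interval $[0,1]$, takes $\nu$ to be `RATE` times Lebesgue measure, and uses $g(x) = x$; the helper names are hypothetical and none of this comes from the paper.

```python
import random

def poisson_points(rate, rng):
    """Points of a homogeneous Poisson process on [0, 1], built from exponential gaps."""
    points = []
    position = rng.expovariate(rate)
    while position <= 1.0:
        points.append(position)
        position += rng.expovariate(rate)
    return points

def compensated_integral(g, g_integral, rate, rng):
    """N^c(g) = N(g) - nu(g): sum of g over the points minus rate * integral of g."""
    return sum(g(x) for x in poisson_points(rate, rng)) - rate * g_integral

rng = random.Random(12345)   # fixed seed so the run is reproducible
RATE = 5.0
samples = [
    compensated_integral(lambda x: x, 0.5, RATE, rng)  # integral of x over [0, 1] is 1/2
    for _ in range(40000)
]

mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)
target_variance = RATE / 3.0   # rate * integral of x^2 over [0, 1]

print("sample mean (theory 0):", mean)
print("sample variance:", variance, "theory:", target_variance)
```

The sample mean should be close to zero and the sample variance close to $\int g^2\,d\nu$, up to Monte Carlo error.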

Limit theorems for shifted measures. In this section we prove a series of
preliminary CLTs, involving random hazard rates that are obtained from
$\tilde h^{n,*}$ [as defined in (18)] by adding fixed atoms to the underlying CRM $\tilde\mu^{n,*}$.
The notation and framework are those of Sections 2 and 4.1.

Fix a natural number $k \ge 1$, along with points $x_1, \ldots, x_k \in \mathbb{X}$ such that
$x_i \ne x_j$ for every $i \ne j$, and positive coefficients $z_1, \ldots, z_k \in \mathbb{R}^+$. We define
the discrete measure $\Delta(\cdot)$ on $(\mathbb{X}, \mathcal{X})$ as follows:

$$\Delta(B) = \sum_{j=1}^{k} z_j \delta_{x_j}(B), \quad B \in \mathcal{X}, \tag{57}$$

where $\delta_y$ stands for the Dirac mass concentrated at $y$. Now set
$\tilde\mu_\Delta^{n,*}(B) = \tilde\mu^{n,*}(B) + \Delta(B)$, for $B \in \mathcal{X}$, where $\tilde\mu^{n,*}$ is the CRM appearing in (12), and
also




Note that, with the notation introduced in (58), one has that the
cumulative hazard rate with fixed random jumps $\tilde H_{\Delta^{n,*}}(T)$ is indeed such that
$\tilde H_{\Delta^{n,*}}(T) = \tilde H^{n,*}_{\Delta^{n,*}}(T)$; however, this heavy notation is avoided henceforth.

Our aim is now to establish CLTs for linear and quadratic functionals of
the transformed hazard rate $\tilde h_\Delta^{n,*}(\cdot)$. These results represent the
"deterministic skeleton" upon which the conditioned CLTs of Section 4 are constructed.
Note that the random measure $\tilde\mu_\Delta^{n,*}$ is a CRM with fixed atoms (given by the
points $x_1, \ldots, x_k$), so that one cannot apply directly the theories developed
in [27, 28]. An integer $n \ge 0$ is fixed for the rest of the section.

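Proposition 12. Suppose that points (i) and (ii) in the statement of
Theorem 8 are satisfied for $n \ge 0$. Assume moreover that

$$\lim_{T \to +\infty} C_0(n,k,T) \times \sum_{j=1}^{k} z_j \int_0^T k(t, x_j)\,dt = m(n, \Delta, k) \in [0, +\infty). \tag{59}$$

Then, letting $X \sim \mathcal{N}(m(n, \Delta, k), \sigma_0^2(n,k))$, we have

$$C_0(n,k,T) \times \bigl[\tilde H_\Delta^{n,*}(T) - \mathbb{E}[\tilde H^{n,*}(T)]\bigr] \xrightarrow{\text{law}} X.$$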

Before proving the result, it is worth pointing out that (59) only involves 
deterministic quantities, and also that we do not suppose (29) to hold. 

Proof. First, write

$$\begin{aligned}
C_0(n,k,T) \times \bigl[\tilde H_\Delta^{n,*}(T) - \mathbb{E}[\tilde H^{n,*}(T)]\bigr]
 ={}& C_0(n,k,T) \times \bigl[\tilde H^{n,*}(T) - \mathbb{E}[\tilde H^{n,*}(T)]\bigr] \\
 &+ C_0(n,k,T) \times \sum_{j=1}^{k} z_j \int_0^T k(t, x_j)\,dt.
\end{aligned}$$

Now observe that $\tilde H^{n,*}(T)$ is the cumulative hazard rate obtained from a
CRM with intensity $\nu^{n,*}$. As a consequence, according to Theorem 1 in [27],
whenever conditions (i) and (ii) of Theorem 8 are verified, one has that the
sequence $C_0(n,k,T) \times [\tilde H^{n,*}(T) - \mathbb{E}[\tilde H^{n,*}(T)]]$ converges in law to a Gaussian
random variable with variance $\sigma_0^2(n,k)$. Since (59) holds by assumption, and
since $m(n, \Delta, k)$ is deterministic, the conclusion follows. □

Now, define the kernel $k_\Delta^{(4)}$ as in (26), by simply replacing $\Delta^{n,*}$ with $\Delta$,
that is:

k 

4 a( S ' x ) := ^2 4 ( s > x ' z j' ( s > x ) e ^ + x ^ > 

i=i 

Proposition 13. Suppose that all the assumptions in the statement of 
Theorem 9 are satisfied, except for points 5. , 6. and 7. , which are replaced, 
respectively, by 

5b. C?(n,fc,T)||4 2) +24 3 l + 2 4 4 Alli>",*) -Krl(n,A,k) > 0; 

6b. Cf(n,A:,r)||4 2) +24 3 L + 24Alli 3 (^-) 

Ci(n,k,T) 



7K lim 

7b. t^+oo T 



T / k y 

J \^2 z jH^ x j)J dt = v(n, A, k) € [0, +oo) 



Then, letting $X \sim \mathcal{N}(v(n, \Delta, k), \sigma_1^2(n,k) + \sigma_4^2(n, \Delta, k))$, we have
d(n,A;,r)x|i £h¥(t)*dt 

(60) [ T E[h n >*(t)]k(t, Xj )dt 

- ^ j\[h n '*(t) 2 ]dt^ ^ X. 

Proof. Denote by $N^{n,*}$ the Poisson measure on $\mathbb{R}^+ \times \mathbb{X}$, with intensity
$\nu^{n,*}$, determining $\tilde\mu^{n,*}$. We also write $N^{c,n,*}$ to indicate the compensated
Poisson measure associated with $N^{n,*}$. First observe that, by, for example,
Lemma 1 in [27],



T h n -*{t)Y^z j k^yj) dt 



r 







law i r T N^*((.) k (t,-))Y: zjHt, Xj )dt 

TJo j=1 

1 r T r k 
+ — / / sk(t,x)v n '* (ds,dx)} Zjk(t,Xj) dt 
J- Jo Jr+xX ~rl 



i=i 
A fW (4 4) A)+E| / nh n '*(t)]k(t, Xj )dt, 



where $N^{c,n,*}((\cdot)k(t,\cdot)) := \int_{\mathbb{R}^+ \times \mathbb{X}} s\,k(t,x)\,N^{c,n,*}(ds,dx)$. From the previous
relations, we deduce



i r T 1 r T i r T t k ^ 2 



T 

(61) 



j o hr(t) 2 dt i ^±j o { h n >*(t)fdt+±j o ^jk(t, Xj )j dt 



+ N c '' n '*(2k^ A ) + 2jj|jf E[/i n '*(t)]A;(t, x,-) dt. 
Now apply the calculations contained in [27], Section 5.3, to deduce that 

i f T h n <*(t) 2 dt-± [ T E[h n <*(t) 2 ]dt 

(62) T7 ° T7 ° 

1 i= w iv c ^(4 2) +24 3 ^+/ 2 (4 1) ), 

where $I_2$ stands for a double Poisson integral with respect to $N^{c,n,*}$ (see [27]
or [28] for further details). From the last formula and from (61) we infer
that the expression in (60) has indeed the same law as

Ci(n, k,T)[N c ' n '*(kP + 24 3) n , + 24 4 a) + H^)] 
+ C 1 (n,k,T)^ (j^Zjk&Xj^J dt. 

To justify the operation of "plugging" the equality in law (62) into (61),
one can use the more general relation:
$\tilde h^{n,*}(t) \overset{\text{law}}{=} \int_{\mathbb{R}^+ \times \mathbb{X}} s\,k(t,x)\,N^{n,*}(ds,dx)$,
where the equality holds in the sense of stochastic processes; see again
Lemma 1 in [27]. Now we can apply directly Theorem 3 in [28] to deduce
that, under the assumptions in the statement, the pair

din, fe,T)(7V c;n -*(4 2) + 24 3 i + 24 4 ,t)> / 2(4 1) )) 

converges in law to $(N, N')$, where $N, N'$ are two independent centered
Gaussian random variables with variances given, respectively, by $\sigma_1^2(n,k)$ and
$\sigma_4^2(n, \Delta, k)$. Since $v(n, \Delta, k)$ is deterministic, the conclusion follows. □

Proposition 14. Suppose that the assumptions of Propositions 12 and 
13 are verified. Assume also that points 1. and 2. in the statement of Theo- 
rem 10 hold and that point 3. in the same statement is replaced by 

3b: \\Cx(n, k, T)(4 2) + + 2fc£j0 - S(n, k)C (n, k, T)kf 

-^a1(n,A,k) > 0. 
Then, letting X ~ Jf(v (n, A, k) — 8(n, k)m(n, A, k),al(n, k) + a\ (n, A, k)), 



V{T) :=C 1 (n,k,T)< 



Tjo 

k 



T 



Ht*(T) 



T 



dt 



(63) 



Y^r [ T nh n '*(t)]k(t, yj )dt 



Proof. Throughout the proof, we use the symbol $A(T) \approx B(T)$ to
indicate that $A(T) - B(T)$ converges to zero in probability. First observe that

Cl (n,k,T)(^^ 2 

_Ci(n,fc,T){Co(n,fc,T)[^(^)-]E(ff"'*(T))]} 2 



+ 



T>C (n,k,Tf 

Cl(«^ 5 T ) TI? ,£V„,*^N2 



J"2 



+ 2 Cl T) E(H n >* (T) ) [Hi* (T) - E(ff (T))] . 

Since point 1 in the statement of Theorem 10 is verified, and since the 
assumptions of Proposition 12 are in order, we deduce 

^t^l_ { c Q {k,T)[H{T) - E(H(T))]} 2 $ 0. 
Moreover, point 2 in the statement of Theorem 10 yields that, as T ^ +oo, 
2Ci (n,k, T) E ^ n ,* ( T ))[H%* (T) - E{H n >*(T))] 

I 5(n, k)C (n, k,T)[H%*(T) - E(H(T))]. 

Now consider the functional V(T) defined in (63). By reasoning as in the 
proofs of Propositions 12 and 13 we deduce that 

V(T) I d(n,k t T) 



1 rT 



x \tJ [hT(ty]dt- 6(n,k)C (n,k,T)[H%*(T) -E(H(T))] 



Y,^£nh n ^t)]k(t, yj )dt-± J\[h n '*{tf]dt 
iV c;n '*(Ci(n, A;,T)(4 2) + 2k^ n + 2k^ A ) - 5(n,k)C (n,k,T)kP) 

k 

10 



+ I 2 (Ci(n,k,T)kft) - 5(n,k)C (n,k,T)J2zj / k(t,Xj) 
d(n,k,T) f T {* , . A 2 , 



and the conclusion is again obtained from Theorem 3 in [28]. □ 

Proofs of Theorems 8, 9 and 10. For the sake of brevity, we only provide
the complete proof of Theorem 8. From point (i) of Theorem 1 we deduce

E[e iXC (nAT)[H(T)-E[H^(T)]] lx * = ^ _ _ _ ^ y] 

= E[exp( i XCo(n,k,T)[Hr j (T)-E[H^(T)]])], 

where 

Hl*(T) := (H A n,* (T)|X* = (xi, . . . , x k )) = H n >*(T) + ]T J * T K^i) alt. 

i=i J ° 

$\tilde H^{n,*}(T)$ and $\tilde H_{\Delta^{n,*}}(T)$ are defined in (19), and the jump vector $J = (J_1, \ldots, J_k)$
is independent of $\tilde H^{n,*}(T)$ and with law given by (14). The previous relations
and independence yield that

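$$\begin{aligned}
&\mathbb{E}\bigl[e^{i\lambda C_0(n,k,T)\{\tilde H(T) - \mathbb{E}[\tilde H^{n,*}(T)]\}} \,\big|\, J = (z_1, \ldots, z_k),\ X^* = (x_1, \ldots, x_k),\ Y\bigr] \\
&\qquad = \mathbb{E}\bigl[e^{i\lambda C_0(n,k,T)\{\tilde H_\Delta^{n,*}(T) - \mathbb{E}[\tilde H^{n,*}(T)]\}}\bigr],
\end{aligned} \tag{64}$$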

where $\tilde H_\Delta^{n,*}(T)$ is defined in (58) and $\Delta$ is given by (57). Now suppose that
the assumptions of Theorem 8 are met [in particular, (29)]. This implies
that there exists a set $\Omega'$ of $P$-probability one such that, for every $\omega \in \Omega'$,
the probability $B \mapsto P\{X^* \in B \mid Y\}$ has support contained in the set of those
vectors $(x_1, \ldots, x_k)$ such that, for every fixed $(z_1, \ldots, z_k)$ in the support of the
law of $J$, the cumulative hazard rate $\tilde H_\Delta^{n,*}(T)$ appearing in (64) verifies the
assumptions of Proposition 12 [in particular, (59) holds with $m(n, \Delta, k) =
m(n, \Delta^{n,*}, k)$]. This yields, for all such $(x_1, \ldots, x_k)$ and $(z_1, \ldots, z_k)$,

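$$\begin{aligned}
&\mathbb{E}\bigl[e^{i\lambda C_0(n,k,T)[\tilde H(T) - \mathbb{E}[\tilde H^{n,*}(T)]]} \,\big|\, J = (z_1, \ldots, z_k),\ X^* = (x_1, \ldots, x_k),\ Y\bigr] \\
&\qquad \xrightarrow[T \to +\infty]{} \exp\Bigl( i\lambda\, m(n, \Delta^{n,*}, k) - \frac{\lambda^2}{2}\sigma_0^2(n,k) \Bigr).
\end{aligned}$$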

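To conclude, it is sufficient to use the Dominated Convergence Theorem for
conditional expectations, to obtain that, a.s.-$P$,

$$\begin{aligned}
&\mathbb{E}\bigl[e^{i\lambda C_0(n,k,T)[\tilde H(T) - \mathbb{E}[\tilde H^{n,*}(T)]]} \,\big|\, Y\bigr] \\
&\qquad = \mathbb{E}\bigl[\mathbb{E}\bigl[e^{i\lambda C_0(n,k,T)[\tilde H(T) - \mathbb{E}[\tilde H^{n,*}(T)]]} \,\big|\, X^*, Y\bigr] \,\big|\, Y\bigr] \\
&\qquad = \mathbb{E}\bigl[\mathbb{E}\bigl[\mathbb{E}\bigl[e^{i\lambda C_0(n,k,T)[\tilde H(T) - \mathbb{E}[\tilde H^{n,*}(T)]]} \,\big|\, J, X^*, Y\bigr] \,\big|\, X^*, Y\bigr] \,\big|\, Y\bigr] \\
&\qquad \xrightarrow[T \to \infty]{} \mathbb{E}\Bigl[\mathbb{E}\Bigl[\exp\Bigl( i\lambda\, m(n, \Delta^{n,*}, k)(\omega) - \frac{\lambda^2}{2}\sigma_0^2(n,k) \Bigr) \,\Big|\, X^*, Y\Bigr] \,\Big|\, Y\Bigr] \\
&\qquad = \mathbb{E}\Bigl[\exp\Bigl( i\lambda\, m(n, \Delta^{n,*}, k)(\omega) - \frac{\lambda^2}{2}\sigma_0^2(n,k) \Bigr) \,\Big|\, Y\Bigr],
\end{aligned}$$

thus completing the proof.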

The proofs of Theorems 9 and 10 can be obtained by using exactly the
same line of reasoning and by applying, respectively, Propositions 13 and
14.